We have been working on some pretty exciting updates and it's finally time to share! This is the biggest update to FloatDB since its creation.
Since the beginning of FloatDB, one of our missions was to catalog every skin in the game and allow searching the database quickly. Project Vault furthers this with the largest data mining effort conducted on Steam (that we know of), enabling the community to gather greater insight into the economy.
Project Vault is a massive data mining effort in which we fetched over 700 million CS:GO inventories. This data feed will be released evenly over the next week or two to FloatDB and will contain around 100 million new skins and the updated locations of another 100 million skins. We hope you're as excited as we are, and we have more data than just CS:GO skins cough Kato 14 cough...
Along with this, we're introducing Float Premium which allows you to see item history, duplicated items, Steam Market sale history and listing price, and more!
With that out of the way, if you'd like a deeper look into the details, feel free to read on.
History of Finding Skins
In order to understand the importance of this project, it's useful to look at how FloatDB finds new skins and updates items. When originally bootstrapped in April 2019, we used a friend graph mining algorithm to find inventories in the game. This method effectively crawled the friends of Steam profiles, then their friends, and so on. It let us quickly catalog 250 million items, and we stopped it once we could no longer find new friend links.
Unfortunately, while this search is efficient at finding inventories that likely have CS:GO items, it misses inventories that are not well connected in the Steam friend graph.
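To make the blind spot concrete, here is a minimal sketch of a friend graph crawl written as a breadth-first search. It is not our production crawler: `crawl_friend_graph`, `get_friends`, and the toy graph are hypothetical names made up for illustration, and a real crawl would call the Steam friends list instead of a dictionary.

```python
from collections import deque


def crawl_friend_graph(seeds, get_friends):
    """Breadth-first crawl: visit seeds, then their friends, then friends of friends."""
    visited = set(seeds)
    queue = deque(seeds)
    while queue:
        profile = queue.popleft()
        for friend in get_friends(profile):
            if friend not in visited:
                visited.add(friend)
                queue.append(friend)
    return visited


# Toy graph (hypothetical data): "dave" has no friend links to anyone else.
toy_graph = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob"],
    "dave": [],
}

found = crawl_friend_graph(["alice"], lambda profile: toy_graph.get(profile, []))
```

In this toy example the crawl reaches alice, bob, and carol, but never discovers dave, which is exactly the kind of poorly connected inventory a friend crawl misses.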
In addition to this, we also offer the CSGOFloat API which allows API consumers and the over 100k CSGOFloat Extension users to query item data for skins as they browse Steam inventories and the Steam Community Market. We get hundreds of millions of item data requests per month, which continually adds new items to the database.
Even with all these requests, we still had blind spots and were itching to finally achieve the holy grail of data sets...
Unlike our previous data mining efforts, we wanted the most comprehensive data set possible, which is only achievable by fetching (almost) every inventory on Steam.
You might think, how is that even feasible? Aren't there hundreds of millions of Steam accounts? Yeah, we're kinda crazy.
It first starts with figuring out how to iterate every account that has been created. You might have seen Steam IDs such as 76561197960265728, which is the SteamID64 form of a user's ID. The lower 32 bits of this ID actually contain an auto-incrementing account number.
This means that we can effectively generate the SteamID64 for the first, second, third, etc... accounts that were ever created on Steam. There's a bit more complexity to the SteamID64, but for most publicly created user accounts, it's pretty simple.
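As a rough sketch of the idea, the snippet below builds SteamID64 values from account numbers. It assumes the common layout for public individual accounts (universe 1, account type 1, instance 1 in the upper bits, account number in the lower 32 bits). The function and constant names are our own for this example and do not come from any Steam library.

```python
# Assumed upper-bit fields for a publicly created individual account.
UNIVERSE_PUBLIC = 1           # bits 56-63
ACCOUNT_TYPE_INDIVIDUAL = 1   # bits 52-55
INSTANCE_DESKTOP = 1          # bits 32-51

# With the fields above, this equals 76561197960265728.
STEAM_ID64_BASE = (
    (UNIVERSE_PUBLIC << 56)
    | (ACCOUNT_TYPE_INDIVIDUAL << 52)
    | (INSTANCE_DESKTOP << 32)
)


def steam_id64_from_account_number(account_number: int) -> int:
    """Place the account number in the lower 32 bits of the SteamID64."""
    if not 0 <= account_number < 2**32:
        raise ValueError("account number must fit in 32 bits")
    return STEAM_ID64_BASE | account_number


def account_number_from_steam_id64(steam_id64: int) -> int:
    """Recover the auto-incrementing account number from a SteamID64."""
    return steam_id64 & 0xFFFFFFFF


def iter_steam_ids(start: int, stop: int):
    """Yield SteamID64 values for account numbers in [start, stop)."""
    for account_number in range(start, stop):
        yield steam_id64_from_account_number(account_number)


first_three = list(iter_steam_ids(1, 4))
```

Other account types and universes use different upper bits, so this only covers the simple case described above.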
Fetching Nearly a Billion Inventories
Now that we can iterate over every possible Steam ID, we still need an effective method to actually fetch the inventories for each account. As you might already know, Steam inventories are very unstable, have heavy rate limits, and require a game-coordinator connection to actually fetch item data such as the float value.
Thanks to @cantryde, who tipped us off to a method that not only makes these restrictions less annoying but can also fetch some types of private inventories.
With that out of the way, we set out to complete the mining over a multi-week period conducted on multiple servers, eventually reaching our target Steam account numbers. We also wanted to ensure that our mining efforts didn't cause performance degradation for other users on Steam.
From there, we've now set up the infrastructure to ingest roughly 200 million new inspect links which will be added to FloatDB over the next week or two.
Not Just CS:GO Skins
While finding new skins that haven't been cataloged before is very exciting, we also saw this as an opportunity to find cold, hard data on some interesting metrics the community might like.
Most notably, we have been cataloging the locations of every Katowice 2014 unapplied sticker that we came across, and we'll be doing a full post on the actual number of stickers in this category.
This also gives us the opportunity to tackle a problem we've long wanted to solve - showing duplicated items. Back in the day, Steam support used to duplicate items if a user was scammed, essentially creating multiple copies in the economy (this was abused a lot). Project Vault allows us to properly see how many duplicated skins there are and present this info.
We've been cooking up some updates that allow further insight into the CS:GO economy and FloatDB specifically. You can find more details here but let's show some key features:
This has undoubtedly been one of the major things we wanted to add to the database for quite some time. Item history allows you to see the lineage of an item as it is traded and sold.
However, we wanted to take it one step further, so we've also cataloged data on the exact price at which an item has historically sold on the Steam Community Market. This should help you find a long-lost skin or bargain for an item.
Note: Item History is applicable to updates that occurred after 05/05/2020
Instead of starting at the standard 4.5% fee, Premium users start at 3.5% on Float Market.
If your sale volume permits a fee lower than 3.5% and you have Premium, we'll use that fee instead. You can find the full table here.
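The fee rule boils down to taking the lower of two numbers. Here is a small sketch under a few assumptions: `effective_fee_percent` and `volume_fee_percent` are hypothetical names, the volume-based fee is passed in rather than looked up (the tier table isn't reproduced here), and we assume non-Premium sellers get the same "lowest fee wins" treatment against the standard 4.5%.

```python
from typing import Optional

STANDARD_FEE_PERCENT = 4.5  # starting fee without Premium
PREMIUM_FEE_PERCENT = 3.5   # starting fee with Premium


def effective_fee_percent(
    has_premium: bool, volume_fee_percent: Optional[float] = None
) -> float:
    """Return the lower of the starting fee and the volume-based fee, if any."""
    starting_fee = PREMIUM_FEE_PERCENT if has_premium else STANDARD_FEE_PERCENT
    if volume_fee_percent is None:
        return starting_fee
    return min(starting_fee, volume_fee_percent)


# Hypothetical example: a Premium seller whose volume would permit a 3.0% fee.
premium_high_volume_fee = effective_fee_percent(True, volume_fee_percent=3.0)
```

A Premium seller whose volume tier only permits 4.0% would still pay 3.5%, since the Premium starting fee is lower.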
Float Premium introduces new searching filters such as min/max update age, Steam Market listing price, whether the item has been sold on Steam Market already, collection search, and sorting by recently updated.
We've also made the Newest/Oldest ID sorting a premium feature in order to help prevent abuse.
Dupes and Listing Data
As we mentioned earlier, Project Vault has enabled us to collect data on how many duplicates of an item exist. With Premium, we now show a column indicating how many times an item was duplicated. You can go to the item history to find the locations of dupes.
We've also annotated a column with the current listing price of the item on Steam Community Market and whether it has already been sold.
We're excited to finally share this with you guys, and we hope you find Project Vault and Float Premium useful. Who knows what new #1 float ranked skins or 4x Katowice 2014 iBuyPower skins await us.
This has been a journey, and we're glad to be a part of it.