White elephants are larger than life, rare, and beautiful to behold. Their unrivaled beauty and power make them more than mere symbols of prestige; they also carry great economic value.

They’re also incredibly burdensome and costly to maintain. 

I can’t imagine a better analogy for street-level data.

In this week’s blog I’ll be diving into street-level geospatial data, describing what it is, why, like the white elephant, it is so valuable and burdensome, and the approaches folks in the industry are taking to produce this data today. 

 

What is Street-Level Data?

Street-level data is a category of geospatial data collected from mobile sensors positioned around the height of a car. Google’s Street View is probably the most well-known street-level data product, followed by Microsoft Bing’s Streetside product. I use “street-level” as a more general term encompassing data from vehicle-mounted, backpack, mobile device, and other sensors collecting data on the external environment from the ground. 

Street-level data can describe a single data type such as imagery, but it is most valuable when imagery is accompanied by additional data types including:

  • 360-degree high-definition imagery: This is produced by using four or more cameras to collect imagery at the same time. The result is a 360-degree view of the entire street from a single pass.
  • LiDAR: Laser-generated point clouds add depth and 3D structure to your data set, making it easier to measure distances and detect people, cars, and infrastructure.
  • IMUs: Inertial Measurement Units measure acceleration and rotation to track changes in position and orientation, which helps align images when they are stitched together.
  • GPS and GNSS: These satellite positioning technologies (GPS is one of several GNSS constellations) help with geolocating the collected data.
  • Wheel Encoders: While satellite-derived positioning is helpful, precision can be lacking, especially in cities with tall buildings. Wheel encoders measure tire rotation to determine distance traveled and improve positioning accuracy. 
  • Radar: Radio waves are used to detect objects and vehicles around the collection vehicle, especially in adverse weather conditions.
  • Ultrasonic Sensors: These are used to detect close proximity objects, especially at low speeds. 
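To make the wheel encoder idea concrete, here is a minimal sketch of how tire rotation turns into distance, and how that distance can be combined with a heading (from an IMU, for example) to estimate position when satellite signals drop out. The function names, tick counts, and tire size are my own illustrative assumptions, not any vendor's actual implementation, and real systems fuse these signals with far more care.

```python
import math

def encoder_distance_m(ticks, ticks_per_rev, tire_diameter_m):
    """Distance traveled = wheel revolutions x tire circumference."""
    revolutions = ticks / ticks_per_rev
    return revolutions * math.pi * tire_diameter_m

def dead_reckon(x_m, y_m, heading_rad, distance_m):
    """Advance a 2D position by distance_m along heading_rad (0 = +x axis)."""
    return (x_m + distance_m * math.cos(heading_rad),
            y_m + distance_m * math.sin(heading_rad))

# Hypothetical numbers: 2,048 ticks per revolution, 0.65 m tire diameter.
step = encoder_distance_m(ticks=10_240, ticks_per_rev=2_048, tire_diameter_m=0.65)
x, y = dead_reckon(0.0, 0.0, heading_rad=math.radians(90), distance_m=step)
print(f"traveled {step:.2f} m, now at ({x:.2f}, {y:.2f})")
```

Small errors in tire diameter or heading accumulate with every step, which is one reason encoders complement satellite positioning rather than replace it.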

For this article, I’ll be using street-level data to refer to data collected from a single platform containing all the sensor types above. Let’s dive into why street-level data is so valuable and the burdens that come with originating it. 

 

Why is Street-Level Data So Valuable?

All roads lead to Rome. That in a nutshell is why street-level data is so powerful.

Today, just as in the first century AD, the built world is primarily navigated by vast interconnected road networks. With few exceptions in remote or sunken environments, you can get to most places on a road. This is why almost every building, lot, port, you name it, has an address whose primary reference is the name of the road upon which you can get there.

At this point you might be asking, “That’s great and all, Jordan, but can’t we just take pictures of roads from satellites and know where they all are in an instant?” Yes, you can.

But how do you know the names of the roads? How do you determine the address of the buildings? How can you find out a business name, its hours of operation, whether it uses Uber Eats or Grubhub? How can you determine the health of a telephone or light pole from a satellite or aerial image (there are ways to do this… but I digress)?

This is where street-level data starts to shine. Not only can you access most of the built world by road, but in doing so you can capture a level of accuracy far exceeding that of other geospatial data collection platforms. 

Satellites can collect large areas, but at relatively low resolution. Manned aircraft improve resolution, but even oblique imagery can’t capture storefronts and street signs with the same fidelity as street-level. Drones can cover more area with similar resolution, but can’t support the same payload, limiting the data types you can collect in one pass.

Back to what you might be thinking: “Okay, okay, so street-level data is cool and all, but Google, Bing, OpenStreetMap, and others already have all this stuff out there, can’t you just use theirs?” Yes, you can, but they’re, well, theirs, and you’re contractually limited in what you can do with their data. Even OpenStreetMap and other open data sources come with strings attached, limiting the commercial applications of derivative data sets and products.

And here is where proprietary street-level data takes the geospatial cake: with it (and the power to process it…) you can build a map and do anything you want with it. From helping cities, utilities, and telcos maintain their infrastructure, enabling autonomous vehicle navigation, improving AR persistence, and optimizing last mile logistics for LTL carriers to studying urban change, empowering application developers to build upon your platform, safeguarding property rights, and improving emergency response capabilities, street-level data is a fundamental building block of modern maps.

It’s beautiful, powerful, and extremely valuable. But like the white elephant, street-level data can be burdensome, especially if you don’t have a clear strategic path toward monetization. 

 

Street-Level Data: The White Elephant

Street-level data is among the least scalable geospatial data types to acquire and maintain because it takes more time and human effort to acquire an area of interest (AOI) than almost any other remote sensing data type. 

This white elephant data set takes more time to collect because you have to physically drive or navigate road networks to get it. Not only are you limited to peak daylight hours (no dawn or dusk collecting!), but you’re at the mercy of traffic, construction, weather, and those pesky school zone speed limits. 
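A quick back-of-the-envelope sketch shows how fast those constraints add up. The road length, average speed, and usable hours below are hypothetical numbers I picked for illustration, and the estimate assumes every road is driven exactly once with no re-drives or travel between segments, so treat it as a floor rather than a forecast.

```python
def collection_days(road_km, avg_speed_kmh, usable_hours_per_day, vehicles=1):
    """Days to drive every road once, ignoring re-drives and dead-heading."""
    driving_hours = road_km / avg_speed_kmh
    return driving_hours / (usable_hours_per_day * vehicles)

# Hypothetical AOI: 5,000 km of road, 25 km/h average, 6 usable daylight hours.
for fleet in (1, 5):
    days = collection_days(5_000, 25, 6, vehicles=fleet)
    print(f"{fleet} vehicle(s): about {days:.0f} collection days")
```

Adding vehicles shortens the calendar, but each one brings its own sensors, operator, and storage to pay for.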

Furthermore, street-level collection at scale requires human operators to manage the sensor systems, operate the vehicles, and safely store these valuable systems at the end of the day. While humans with the proper training can quickly troubleshoot issues with sensors and vehicles, they’re accompanied by labor laws dictating break time and overtime pay (which are great for your drivers and something you should provide even without the government threat of force). But this means more time and money are required to collect large areas. Plus, unlike aerial collection, the cost of your human operators only increases as data collection extends further and further from their homes, requiring either new operators or hotels, per diem, and other niceties to support your trained experts.

That being said, autonomous vehicles are on the rise, meaning the scalability of street-level data may be supercharged sometime soon. But unless you produced the AVs, my bet is you’ll either be collecting data for a competitor at the same time, or run into licensing issues for data collected on the back of someone else’s white elephant. Yet even without legal challenges, AVs will not only need regular costly maintenance like tire changes, but will likely still require human operators onboard to troubleshoot issues, fill their tanks (or plug them in), and clean off sensors after they get splashed with mud!

That’s all just the collection effort. Once you’ve collected your data, you need to ingest it, process it, QC it, and integrate it with other data types to make it truly valuable. Moving terabytes of data from each sensor each day is no easy task. You can ship hard drives, or rely on high-speed internet connections. But even if you have reliable fiber at one location, will you have access at the next? Then where do you process all this data to stitch it together, geolocate it, extract objects and text, and QC it to ensure that the quality is marketable and that regulatory risks around privacy are minimized? How do you train your ML models to extract objects with market-grade accuracy? How do you let the vehicle operators know where they need to go back and recollect in the event there’s an issue with the data?
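To put the upload problem in numbers, here is a rough sketch of how long a day's haul takes to move over different connections. The 4 TB daily volume and the link speeds are assumptions for illustration only, and the math uses decimal terabytes and assumes the link runs at full speed the whole time.

```python
def transfer_hours(terabytes, link_mbps, efficiency=1.0):
    """Hours to move `terabytes` (decimal TB) over a `link_mbps` connection."""
    megabits = terabytes * 8_000_000  # 1 TB = 10^12 bytes = 8 x 10^6 megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600

# Hypothetical daily haul of 4 TB over three assumed uplink speeds.
for label, mbps in [("hotel Wi-Fi", 20), ("business cable", 200), ("fiber", 1_000)]:
    print(f"{label:>14}: {transfer_hours(4, mbps):7.1f} hours")
```

Whenever the result is longer than 24 hours, the upload can never catch up with daily collection, which is why shipping hard drives stays on the table.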

I don’t mean to belabor this burden point too much, although we’re only scratching the surface. 

At the end of the day, it’s not enough to just drive every road within your AOI and process all the resulting data, a feat in and of itself. Things change, businesses move, new roads are built, infrastructure decays… You must refresh your data with some regularity to keep it accurate and valuable. 

Multiple corporate goliaths have attempted to harness this white elephant only to transition away from their proprietary data collection efforts years later. Google, Microsoft, Apple, and Uber are among them. But street-level data collection continues. Let’s dive into the different ways to make it happen.

 

How is Street-Level Data Collected Today?

Street-level data is too valuable for geospatial giants to be dissuaded from acquiring it altogether by its burdensome nature. Today, there are four primary ways to collect street-level data: first-hand, outsourced hired guns, client-driven, and crowdsourced. The method used is largely dictated by the strategic objectives of the organization seeking to acquire street-level data. Let’s explore each collection method in turn.

 

First-Hand

Organizations can build or buy sensors, buy or lease a fleet of vehicles, and hire or contract operators to drive their sensors around AOIs to collect street-level data. I call this first-hand acquisition. First-hand acquisition describes collection operations directly funded by the organization interested in producing street-level data products. 

First-hand collection produces the most versatile and proprietary data sets because the organization doing the collection can control processes, quality, use, and monetization of the data unencumbered by license restrictions. It’s also the most expensive way to acquire street-level data.

But once you have the data, it’s yours. If you’re trying to build a proprietary base map, or leverage one data type for a host of derivative products, first-hand acquisition may be the best option for you. It’s how Google, Bing, Uber, and Apple built truly customizable proprietary base maps.

 

Hired Guns

It’s hard to beat the efficiency of specialization. Companies explicitly focused on street-level data collection can be used as hired guns to collect data for you at regular intervals. They’ll leverage their experience and often global presence to navigate the complexities of street-level data collection so you can spare yourself the headache. 

This data is often still quite expensive, comes with use restrictions, and you often don’t have control over how or where it is collected, making it less versatile for your specific use case. 

If you already have a map and just need data refreshed from time to time, or need data to fuel internal product development and you don’t plan on selling the street-level data directly, hired guns are a great option. The giants previously mentioned have transitioned to this option and engineering firms have long used it, engaging companies like TomTom, Cyvl.ai, and Cyclomedia to fulfill their ongoing data requirements.

 

Client-Driven

Some companies will develop sensor and processing technology and sell or lend their use to clients. Sometimes the sensor and processing provider can retain some rights to the data produced. This is a way to collect street-level data leveraging someone else’s fleet of vehicles and drivers, thereby reducing costs. 

The challenge here is buttoning up your system to the point where the garbage man whose truck it’s strapped to doesn’t get distracted from his already strenuous job on account of having to wipe your lenses every few minutes. There are also contractual challenges around data licensing and use. If the customer is paying for your services, isn’t it their data?

While this strategy is less costly, it comes with greater risks to data versatility, quality, and coverage (fleet vehicles don’t often scour every nook and cranny but stick to regular routes). It’s also less focused on building broader data products/platforms, and more focused on providing data collection and processing services to end clients.

If you’re resource-constrained and not too worried about having full control over your sensors and data, this is a great route to take to start generating cash flow as you improve your product. Startups such as Cyvl.ai, City Detect, and Hyperspec AI leverage this model.

 

Crowdsourced

One of the sexiest methods for data collection for those controlling the budget, but scariest for those responsible for quality products, is crowdsourced data. There are few easier and more cost-effective ways to gather a ton of questionable data quickly. 

Companies like Hivemapper have built sensors you can buy, stick to your dashboard, and map areas on your own. They get data; you get rewarded with their crypto token “Honey”. They incentivize unique collection by adjusting the reward for uncollected roads. Hivemapper has collected an impressive 51.5 million kilometers total, of which 4.93 million km are unique, within 8 months.
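Those two totals are worth a quick calculation, because they show how much of the crowdsourced effort lands on roads that were already mapped. This is just arithmetic on the figures quoted above; the variable names are mine.

```python
total_km = 51.5e6   # total kilometers collected, from the figures above
unique_km = 4.93e6  # unique kilometers, from the figures above

unique_share = unique_km / total_km
avg_passes = total_km / unique_km

print(f"unique share of collection: {unique_share:.1%}")
print(f"average passes per unique km: {avg_passes:.1f}")
```

That works out to roughly 9.6% unique coverage, or about 10.4 passes per unique kilometer on average. The repetition is great for freshness on busy roads, and it also explains why rewards are adjusted toward uncollected roads.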

Similar to client-driven, crowdsourced collection is a great way to continuously map thoroughfares quickly and on a budget. But it’s challenging to maintain quality and reach places less traveled.

While these are the primary methods of collection employed today, there’s ample room for innovation and optimization when it comes to street-level data collection. One thing is certain: it’s vital to understand your strategic business objectives when selecting the method(s) of data collection you’re going to employ.

 

Conclusion

If you haven’t guessed already, I’m extremely bullish on the value of street-level data. Not only for end customers who need innovative ways to plan, navigate, and maintain their infrastructure, but for geospatial technologists interested in making truly powerful and versatile base maps. Moreover, as digital twins continue to gain prominence, street-level data will be integral to producing truly immersive experiences across AR, VR, and traditional digital environments. 

Street-level data’s degree of detail and ability to reproduce the human perspective is unrivaled by any other geospatial data type. But much like the white elephant, acquiring and maintaining street-level data can be burdensome. It’s critical to carefully consider your strategic business objectives while developing your street-level data strategy.