CVEDIA is Powering AI with Synthetic Data

CVEDIA’s synthetic data services empower companies to develop autonomous applications and smart sensors much more quickly, affordably, and safely.

Artificial intelligence solutions are only as accurate as the datasets used to train them. However, accurate data collection can cost millions of dollars, be logistically difficult or impossible, result in significantly biased data, or all three, presenting enormous roadblocks for large enterprises that want to leverage AI, machine learning, or computer vision. The challenges of data collection don’t just impact companies’ bottom lines. They stand in the way of security, safety, and smart city applications that have the potential to save human lives and build safer, more sustainable public spaces.

Synthetic datasets, which are generated using big data and machine learning theory instead of through real-world data collection, were developed to address these problems. This “artificial” data has recently been shown to produce the same results as real-world data, provided that developers are able to accurately simulate real-world conditions. CVEDIA has spearheaded that effort.

Eliminate sampling bias & simulate any real-world scenario

Using SynCity, a photorealistic simulator that creates ultra-realistic digital environments, CVEDIA generates accurate synthetic datasets that are optimized for neural network training and validation. With CVEDIA datasets, organizations can greatly reduce the need for real-world data collection, allowing them to develop autonomous applications and smart sensors much more quickly, affordably, and safely.

CVEDIA synthetic datasets are designed specifically for AI: they are purposely populated with a large amount of variety, which AI needs in order to learn effectively and to avoid the accidents or incorrect decisions that can follow from poor training. They can be used to simulate any type of scenario, including those that would be too dangerous or cost-prohibitive to run in the real world, such as extreme weather conditions, system failure events, and collisions.

“The average system is outfitted with hundreds of thousands of dollars worth of equipment to collect this data, so it can push projects out of the budget range for a lot of teams,” explains Arjan Wijnveen, CEO. For example, an autonomous drone system in training may need to be exposed to scenarios in which it nearly collides with other drones, to ensure the AI behaves correctly if that ever happens. Staging such scenarios in the real world creates problems both for the equipment, which can break, and for budget planning, since it is difficult to know how many times the scenario will need to be repeated.

Working with partners such as FLIR, the world’s leading thermal sensor producer, CVEDIA is committed to optimizing synthetic data for training purposes. In tests comparing the performance of CVEDIA synthetic datasets against real-world data, the synthetic sets have at times outperformed “real” datasets.

Safety and Security Applications

AI solutions have become fundamental to national security, not only to alert organizations to possible network hacks, but also to train security applications.

The hidden vulnerability in our nation’s artificial intelligence development is that many of the datasets being used to train machine learning systems are inherently biased, because they reflect what they were originally created for (often an unrelated purpose) and who created them. Even more concerning, many applications are currently trained using datasets generated in China, a major U.S. cyber adversary. This leaves security holes open for the introduction of malicious data, e.g., data that may cause an application to malfunction or potentially cause harm.

CVEDIA solves these problems by providing security organizations with complete, fully traceable datasets that supplement real-world data, fill in coverage gaps, and control for sampling bias. In one use case, FLIR employed CVEDIA’s synthetic datasets to train sensors to detect drones flying overhead from the ground. In this case, CVEDIA’s data significantly reduced false positives (detections of a drone when none is present) compared with the original dataset. An application like this gives airports the ability to ensure their airspace is secure.
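
To make the false-positive measure concrete, the short Python sketch below computes a false positive rate for a drone detector. It is an illustration only: the function name, the frame labels, and both sets of detector outputs are invented for this example and are not FLIR or CVEDIA results.

```python
def false_positive_rate(predictions, labels):
    """Share of drone-free frames on which the detector still reported a drone."""
    # Keep only the predictions made on frames that truly contain no drone.
    on_negatives = [p for p, y in zip(predictions, labels) if not y]
    if not on_negatives:
        return 0.0  # no drone-free frames, so no false positives are possible
    return sum(1 for p in on_negatives if p) / len(on_negatives)


# Hypothetical ground truth: True means a drone is actually in the frame.
labels = [True, False, False, True, False, False]

# Hypothetical detector outputs before and after retraining (made-up values).
baseline = [True, True, False, True, True, False]
retrained = [True, False, False, True, True, False]

baseline_fpr = false_positive_rate(baseline, labels)
retrained_fpr = false_positive_rate(retrained, labels)
print(f"baseline: {baseline_fpr:.2f}, retrained: {retrained_fpr:.2f}")
```

A lower rate on the same drone-free frames is what “reduced false positives” means in practice.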

Smart City Applications

In addition to security, CVEDIA is committed to the use of AI to create a safer planet. CVEDIA provides high-quality synthetic datasets to urban developers and smart city organizations so that they can create safer, more livable cities. These datasets are used to run scenarios that allow developers to design efficient roadways and walkways, understand the impacts of potential changes to existing infrastructure, and identify dangerous intersections and other hazards.

Developers can choose from a vast library of existing datasets or work with CVEDIA to create their own. To ensure that the system being trained behaves correctly during edge-case scenarios, CVEDIA synthetic datasets can be used to simulate any type of climate, weather, lighting, or traffic conditions and train machine learning systems to identify city infrastructure, people, wildlife, bicycles, vehicles, and other objects.

In one scenario, a smart city client’s system was having difficulty recognizing people when they were partially covered by another object, like a light post or mailbox. This type of system failure could lead to pedestrian accidents. CVEDIA was able to create a dataset consisting of only those situations, giving the client the ability to rework their algorithms to overcome the issue.
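
As a rough illustration of what a dataset “consisting of only those situations” could look like, the Python sketch below filters annotations down to partially occluded people. The annotation schema, the field names, and the occlusion thresholds are all assumptions made for this example. They do not describe CVEDIA’s actual data format or tooling.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    image_id: str
    label: str        # object class, e.g. "person" or "vehicle"
    occlusion: float  # hypothetical field: fraction of the object hidden, 0.0 to 1.0


def select_occluded_people(annotations, min_occlusion=0.2, max_occlusion=0.8):
    """Keep only people who are partially, but not fully, hidden by another object."""
    return [
        a for a in annotations
        if a.label == "person" and min_occlusion <= a.occlusion <= max_occlusion
    ]


# Made-up sample annotations for demonstration.
samples = [
    Annotation("img_001", "person", 0.0),   # fully visible person
    Annotation("img_002", "person", 0.5),   # half hidden, e.g. behind a light post
    Annotation("img_003", "vehicle", 0.5),  # occluded, but not a person
    Annotation("img_004", "person", 0.95),  # almost entirely hidden
]

subset = select_occluded_people(samples)
print([a.image_id for a in subset])
```

The thresholds are arbitrary here; a real project would choose them to match the failure cases observed in the client’s system.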

Using synthetic data for smart city design allows developers to simulate scenarios that would be too expensive or unacceptable to reproduce in real life, such as traffic accidents. The use of synthetic data also avoids the privacy and regulatory issues that can arise when data is collected from real-world devices, such as traffic cameras.

All CVEDIA projects are developed by a highly experienced in-house team that is headed by synthetic data industry leaders and includes AI veterans with backgrounds in machine learning R&D and large-scale synthetic data deployments. Each project is customized around the unique problems faced by its clients. As CEO Arjan Wijnveen puts it, “We’re a team full of people passionate about creating a better future with AI.”

This article does not necessarily reflect the opinions of the editors or management of EconoTimes.
