CSE498, Collaborative Design, Fall 2016
Computer Science and Engineering
Michigan State University

Avata Intelligence leads the security industry in artificial intelligence and advanced analytics, supporting organizations worldwide with integrated, data-driven solutions.

For example, law enforcement units can use Avata’s platform to analyze crime records to predict when and where future crimes will occur. Rather than patrolling randomly or uniformly, officers can patrol when and where crimes are more likely to occur, thereby increasing safety and security.

Accurate analysis of crime records depends on having accurate data. Slightly different entries from different sources often represent the same crime. If an incident appears more than once, the system may falsely predict that this crime is more common than it truly is. Unfortunately, such datasets are far too large to be checked manually for duplicates.

Our Dataset Merger Tool is a web app that automatically identifies and merges duplicate records within and across datasets in the Avata platform.

After a user selects data sources to be merged, our system uses advanced algorithms to identify duplicate records. When the confidence that a set of records are duplicates exceeds a certain level, the user is shown those records and prompted for approval before they are merged.
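
The confidence check described above can be illustrated with a minimal sketch. This is not the tool's actual algorithm, which is not detailed here. It assumes a simple token-overlap (Jaccard) score as the confidence measure, and the class name, method names, and the 0.6 threshold are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateSketch {
    // Hypothetical confidence level; the real tool's threshold is not stated.
    static final double THRESHOLD = 0.6;

    // Split a record into lowercase word tokens, ignoring punctuation.
    static Set<String> tokens(String record) {
        Set<String> result = new HashSet<>();
        for (String t : record.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) {
                result.add(t);
            }
        }
        return result;
    }

    // Jaccard similarity: shared tokens divided by all distinct tokens.
    static double similarity(String a, String b) {
        Set<String> ta = tokens(a);
        Set<String> tb = tokens(b);
        Set<String> union = new HashSet<>(ta);
        union.addAll(tb);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<>(ta);
        shared.retainAll(tb);
        return (double) shared.size() / union.size();
    }

    // Return index pairs whose score meets the threshold. These are only
    // candidates: a user would still approve each pair before any merge.
    static List<int[]> candidatePairs(List<String> records) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < records.size(); i++) {
            for (int j = i + 1; j < records.size(); j++) {
                if (similarity(records.get(i), records.get(j)) >= THRESHOLD) {
                    pairs.add(new int[] {i, j});
                }
            }
        }
        return pairs;
    }
}
```

Comparing every pair of records is quadratic in the number of records, so a sketch like this would only suit small inputs. Large datasets would need some way to narrow the comparisons first.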

Upon completion, our system produces a report containing information useful for analyzing the resulting data integrity.

The front end of our Dataset Merger Tool is written in ReactJS, and the back end is written in Java using the Spring Boot framework. Datasets are stored in a MySQL database.