The Two Sigma senior project is to build a data ontology system that discovers and exploits relationships among data gathered from disparate sources, enabling Two Sigma to derive better business insights from its existing data and systems.
The system consists of six components: a collector API, an ontology cache, a query and reporting engine, a command line interface, a web interface, and a security manager.
The collector API will aggregate data from all sources into a common data format, and the resulting data will be stored in the ontology cache. Users will use the query and reporting engine to query the knowledge base held in the ontology cache and to build reports from it. The command line and web interfaces will provide convenient front ends to the query and reporting engine. Lastly, the security manager will enforce both document-level and user-level security. Semantic web technologies such as OWL and RDF will serve as the building blocks for many of these components.
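The flow from collector to cache to query can be illustrated with a minimal sketch. It is not the project's design: the `Triple` format, the `OntologyCache` class, the `collect` function, and the sample sources and records are all hypothetical, the triple shape is only loosely modeled on RDF statements, and the security manager and user interfaces are left out.

```python
from dataclasses import dataclass


# Hypothetical common data format: a subject-predicate-object triple,
# loosely modeled on RDF statements.
@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str


class OntologyCache:
    """In-memory stand-in for the ontology cache (an assumption for illustration)."""

    def __init__(self):
        self._triples = set()

    def add(self, triple):
        self._triples.add(triple)

    def match(self, subject=None, predicate=None, obj=None):
        """Return matching triples in sorted order; None acts as a wildcard."""
        return sorted(
            (t for t in self._triples
             if subject in (None, t.subject)
             and predicate in (None, t.predicate)
             and obj in (None, t.obj)),
            key=lambda t: (t.subject, t.predicate, t.obj),
        )


def collect(cache, source_name, records):
    """Collector step: turn source records (dicts with an 'id' key) into triples."""
    for record in records:
        subject = f"{source_name}:{record['id']}"
        for field, value in record.items():
            if field != "id":
                cache.add(Triple(subject, field, str(value)))


# Two made-up sources with different record shapes.
cache = OntologyCache()
collect(cache, "wiki", [{"id": "page1", "author": "alice", "topic": "risk"}])
collect(cache, "tickets", [{"id": "T-7", "author": "alice", "status": "open"}])

# Query step: find every artifact authored by "alice", regardless of source.
authored = [t.subject for t in cache.match(predicate="author", obj="alice")]
print(authored)
```

Because both sources are reduced to the same triple shape, a single query spans them without knowing where each record came from.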
The final product will be a system that allows users to discover previously unknown relationships between artifacts and to develop business insights that were previously unavailable. The system will be highly extensible: users can add new relationships between artifacts, and further business insights become available as more relationships are added to the system ontology.
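How one added relationship can expose a previously hidden connection can be sketched with a small graph search. The artifact names, relationship names, and the `relate` and `find_path` helpers are hypothetical, and storing each relationship in both directions under the same name is a simplification made for brevity.

```python
from collections import defaultdict, deque

# Hypothetical relationship store: artifact -> set of (relation, neighbor).
edges = defaultdict(set)


def relate(a, relation, b):
    """Record a relationship between two artifacts (stored in both directions)."""
    edges[a].add((relation, b))
    edges[b].add((relation, a))


def find_path(start, goal):
    """Breadth-first search for the shortest chain of related artifacts, or None."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for _, neighbor in sorted(edges[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [neighbor]))
    return None


relate("report.pdf", "written_by", "alice")
relate("model.py", "reviewed_by", "bob")
before = find_path("report.pdf", "model.py")  # the two artifacts are not yet connected

relate("alice", "works_with", "bob")  # a user adds one new relationship
after = find_path("report.pdf", "model.py")
print(before, after)
```

Before the extra relationship is added, the search finds no chain between the two artifacts; afterwards it finds one that passes through both people.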