What We Do
From information to knowledge
So you have a ton of unstructured data and want to extract insight. Or you need to bring data into the cool analytics platform you have already built. We understand that your needs are unique, which is why we built a completely modular infrastructure that lets us snap together components of our capabilities into a solution tailored to you. We work with you to select from our capabilities, apply them to your specifications, then provide the output you need - whether it’s a private API to which you can make unlimited calls, or a customized real-time visualization.
EpiPulse
EpiPulse is our data science and development stack: a super-fast, scalable Spark infrastructure that supports parallel processing across thousands of machines simultaneously.
The EpiPulse pipeline is designed to aggregate, store, process, analyze, deliver, and visualize information. Each of these individual capabilities can be accessed directly by our team, clients, and technologies through custom application programming interfaces (APIs) and user interfaces (UIs).
Acquire | We have assembled thousands of broad and specialized health datasets available for use. We can ingest any data format, structured or unstructured. If there is a novel data source you would like to import, we can help gather it for you in a research-ready format. Our army of robots can “scrape” data automatically in near real-time.
Store | Datasets require different storage depending on their size and nature. We have experience selecting the right storage solution, including archival at-rest, on-demand, serving web applications, and highly distributed systems. The cloud-based data servers we use are located in the United States, as well as several European countries (complying with privacy laws). All data have redundant backups.
Process | We maintain an extensive library of natural language processing and machine learning apps that can parse your structured and unstructured data. Examples include inferential geo-tagging, coding to major ontologies (ICD, MedDRA, UMLS, etc.), semantic language tools, identifying duplicates, detecting sentiment, and stripping personally identifiable information. Most of these tools are available in nearly two dozen languages.
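To make two of the Process steps above concrete, here is a minimal Python sketch of identifying duplicates and stripping personally identifiable information. It is an illustration under stated assumptions, not the EpiPulse implementation: the function names, the two regular expressions (email addresses and US-style phone numbers), and the lowercase-and-collapse-whitespace duplicate rule are all hypothetical, and real PII removal needs much broader coverage.

```python
import re

# Hypothetical patterns for illustration only; production PII detection
# would need to cover names, addresses, identifiers, and more.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def strip_pii(text: str) -> str:
    """Replace email addresses and US-style phone numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies compare equal."""
    return " ".join(text.lower().split())


def drop_duplicates(records: list[str]) -> list[str]:
    """Keep the first occurrence of each record, comparing normalized text."""
    seen: set[str] = set()
    unique: list[str] = []
    for record in records:
        key = normalize(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

Running `strip_pii` before `drop_duplicates` means records that differ only in the contact details they contain are also treated as duplicates, which may or may not be what a given project wants.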
Analyze | Often you will need to run descriptive and inferential statistics on your data. Choose from our library of functions, or bring your own. We speak many statistical languages (SAS, Stata, R, Python, PHP, and others), so we can implement your existing code and optimize it for the data environment.
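As a small illustration of the two kinds of statistics mentioned above, the Python sketch below computes a descriptive summary of one sample and Welch's t statistic for comparing the means of two samples. It uses only the standard library; the function names `describe` and `welch_t` are assumptions made for this example and do not refer to functions in our library.

```python
import math
import statistics


def describe(values: list[float]) -> dict[str, float]:
    """Descriptive summary: count, mean, median, and sample standard deviation."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # requires at least two values
    }


def welch_t(a: list[float], b: list[float]) -> float:
    """Welch's t statistic for the difference in means of two samples
    (does not assume equal variances)."""
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error
```

The t statistic alone is not a p-value; turning it into one also requires the Welch–Satterthwaite degrees of freedom and a t distribution, which are left out to keep the sketch short.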
Visualize | We can return your processed data in formats that you can visualize in your own interfaces. But if you would like a custom interface, say one that is mobile-optimized, we can provide that. Our user interfaces have won many awards for their elegance and power.
Deliver | We can output and deliver the datasets in any format your organization needs, including static spreadsheets, RSS feeds, dynamic APIs, XML, JSON, and E2B. Static data exports, ongoing live access, subscriptions, and reports are available.
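To show what delivering the same records in more than one format can look like, here is a minimal Python sketch that serializes a list of records to JSON and to XML with the standard library. The record fields (`id`, `term`), the tag names, and the function names are hypothetical, and the sketch assumes every field name is a valid XML tag name.

```python
import json
import xml.etree.ElementTree as ET


def to_json(records: list[dict]) -> str:
    """Serialize records as a JSON array with stable key order."""
    return json.dumps(records, sort_keys=True)


def to_xml(records: list[dict], root_tag: str = "records",
           item_tag: str = "record") -> str:
    """Serialize records as XML, one element per record and one child per field."""
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        for key, value in record.items():
            child = ET.SubElement(item, key)  # assumes key is a valid tag name
            child.text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = [{"id": 1, "term": "fever"}]
    print(to_json(sample))
    print(to_xml(sample))
```

Keeping one in-memory representation and adding a small serializer per format is one simple way to support several outputs; more specialized formats such as E2B have their own schemas and would need dedicated handling.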
Secure | Security is baked into every system we assemble, with a level of protection commensurate with the need to balance access and vulnerability. Transmission of data can comply with established standards and legal requirements to ensure privacy and fidelity. We have validated options for your industry, including GxP for pharmaceuticals and others. We can also provide click-by-click audit trails, support for litigation holds, and other business processes. Your information is sensitive. If needed, we can “blind” these apps to make them zero-knowledge, meaning our machines can process your data in temporary memory without a human ever having access to it. No traces, no remnants, no worries.