Sample Projects
The following is a sample of some recent custom systems we developed for our clients:
- A knowledge base of post-translational modifications
- Product development platform for a biological reagent company
- Array Repository Data Analysis System for the Department of Defense
- Knowledge management system for a drug discovery company
- LIMS for a microarray facility
- Sequencing pipeline for a genomics company
- Public gene expression database
- LIMS for an MS proteomics facility
- LIMS for a biological sample provider
- Curation of annotations for genes
A knowledge base of post-translational modifications
a) Requirements:
A company had developed a curated resource for protein modification events and their effect on cellular processes. The system suffered from several ills, including poor performance, inadequate curation interfaces, a limited security model, weak search capabilities, and the inability to display all the data stored in the database. In addition to curated data, high-throughput data from a novel mass spectrometry-based method for characterizing protein modifications needed to be accommodated in the system. Although most of the current data was for protein phosphorylation, large amounts of data for other types of protein modifications were expected to be loaded into the system at an ever-increasing throughput.
b) Our solution:
The database for the system was restructured significantly. Important concepts missing from the initial version of the system were introduced, and full support for multiple types of protein modifications was added. The restructuring included many features that resulted in a several-fold improvement in query performance. The software for the system was completely rewritten using current-generation enterprise technologies. In the reengineered system, new user interfaces enable users to formulate sophisticated queries based on treatments, diseases, cell lines, and tissues, and to retrieve bulk data sets. These bulk data sets can be downloaded for analysis and model building. Provisions for bulk queries allow high-throughput labs to determine whether sites from their modification site discovery programs are novel. Interactive viewers for protein structures and the modification sites were incorporated into the system. An export of the database as a graph of biological objects is also available for analysis of the data in a pathway tool.
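The bulk novelty check described above can be pictured with a minimal sketch. It assumes a modification site is identified by protein accession, residue position, and modification type; the `Site` type, the `find_novel_sites` function, and the sample data are hypothetical illustrations, not the system's actual interface.

```python
from typing import Iterable, List, NamedTuple


class Site(NamedTuple):
    """Hypothetical site key: protein accession, residue position, modification type."""
    accession: str
    position: int
    modification: str


def find_novel_sites(known: Iterable[Site], submitted: Iterable[Site]) -> List[Site]:
    """Return submitted sites absent from the known set, in submission order."""
    known_set = set(known)
    seen = set()
    novel = []
    for site in submitted:
        # Skip sites already in the knowledge base and duplicates within the batch.
        if site not in known_set and site not in seen:
            seen.add(site)
            novel.append(site)
    return novel


# Illustrative data only; the accessions and positions are made up.
KNOWN = [Site("P00001", 15, "phosphorylation"), Site("P00002", 204, "acetylation")]
BATCH = [Site("P00001", 15, "phosphorylation"), Site("P00001", 88, "phosphorylation")]

if __name__ == "__main__":
    for site in find_novel_sites(KNOWN, BATCH):
        print(site)
```

A production system would run the same comparison as a database query over the curated sites rather than over in-memory sets.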
Product development platform for a biological reagent company
a) Requirements:
A leading developer and producer of biological reagents had reached the limit of FileMaker’s data management capabilities. They needed a comprehensive data and business process management solution that would support them through their rapid growth. Due to the diversity of their portfolio of reagents, including but not limited to polyclonal and monoclonal antibodies, siRNA, purified protein kinases, small molecules, and peptides, they requested a flexible system capable of recording highly diverse data. Product developers, product managers, laboratory scientists, marketing and graphics staff, receiving and shipping staff, and senior management all required means to record and organize mission-critical information and to search and mine this information in flexible and secure ways. In addition, the system had to facilitate the transfer and tracking of samples to and from contractors, the transfer of product data to the company’s corporate web site, and the publication of high-quality product data sheets.
b) Our solution:
The Product Development Platform is an enterprise system that supports all aspects of reagent development and production. It assists product development through a comprehensive set of Laboratory Information Management System interfaces and functions. Specialized groups such as those performing immunohistochemistry or flow cytometry assays are supported through dedicated modules. The system supports receiving barcoded samples from suppliers and aids the management of freezer inventory. It organizes products in a single repository that brings all product types under a unified paradigm. It tracks lots of products and the assays that characterize the lots. It manages the process of releasing a product for sale and is integrated with the financial and e-commerce systems. Finally, it is tightly integrated with Adobe InDesign for automated production of product data sheets.
Array Repository Data Analysis System for the Department of Defense
a) Requirements: Biological researchers within the Department of Defense
leverage functional genomics technologies for a wide range of studies such as the identification
of novel targets and therapeutics for malaria, the dissection of host-response pathways for
HIV-malaria co-infection, the development of an HIV vaccine, or the identification of biomarkers
and the development of novel therapeutics to counter the effect of chemical and biological
toxins. The data collected from Affymetrix and De Novo spotted arrays through large scale
experiments needs to be tracked, shared with authorized collaborators, and analyzed to extract
important biological knowledge. By identifying causal relationships between stimuli and differential
expression and incorporating prior knowledge from public domain and proprietary sources of structured
and unstructured information, researchers can build association models of biology for their systems
of interest.
b) Our solution: ARDAS is an information system for building models of gene
expression by assisting in the management of both experimental and analytical microarray data. ARDAS
comprises three modules: the Warehouse, the LIMS, and the AIMS. The Warehouse stores all the
information derived throughout the microarray experimental and analytical workflows. The Laboratory
Information Management System (LIMS) module records information associated with the array printing and
hybridization workflows. This information can then be passed to the Warehouse and transferred to the
Analysis Information Management System (AIMS) to determine which genes are differentially regulated for
biological reasons. These lists of genes and expression values can then be stored back into the
Warehouse and queried to build biological models of expression.
Knowledge management system for a drug discovery company
a) Requirements: A drug discovery company needed
a Knowledge Management System (KMS) to capture the information generated by
cross-functional research groups. The company was shifting its focus from
a pure service model to an integrated drug discovery infrastructure and needed
to leverage its research and discovery capability more effectively. The principal
aim of the project was to capture and centralize the knowledge generated by
the scientists in the several divisions, and to organize that knowledge such
that it can be easily mined, browsed, and navigated. By providing a common
platform to all scientists in the organization, KMS enables serendipity by
providing comprehensive views of biological and chemical entities, fosters
collaboration between scientists in different functional groups by defining
a common framework for all groups, supports partnerships with other pharmaceutical
companies by providing data delivery mechanisms and clear separation of intellectual
property, and facilitates decision making through detailed tracking of projects
and programs.
b) Our solution: We specified, designed, and developed an enterprise-wide Knowledge Management System that captures and
organizes the results of experimental work in a unified framework. This information
is cross-referenced to an extensible model for biological and chemical entities.
Rich annotations and relationships are loaded automatically for these entities
from the public domain or through curation by scientists. Researchers are
able to search this knowledge base through sophisticated data mining functions.
The system is also capable of generating new information by
automatically running computational biology tools and recording and processing
the results from the tools. The users can review the generated data, track
the status of the work done across multiple projects and programs, annotate
any object in the system, and associate supporting materials as attachments.
The system can also automatically create links to many outside
sources such as GenBank or MEDLINE.
LIMS for a microarray facility
a) Requirements: A large biotechnology company
needed a Laboratory Information Management System (LIMS) for tracking the
operation of its central microarray facility. The facility used the
Affymetrix platform and also designed, spotted, and hybridized spotted chips.
They wanted the LIMS to record all the steps in the microarray workflow
(RNA labeling, hybridization, and scanning) as well as the construction of spotted
chips. The ability to constrain the work in the laboratory based on the quality-control
assessment of samples and chips was required. The LIMS was mandated to capture
raw and normalized expression data calculated from the images for the hybridized
chips, provide a sophisticated search interface and include extensive reporting
capability.
b) Our solution: We created a LIMS with a common
database model and a unified set of user interfaces for the Affymetrix and
spotted array platforms. Through the LIMS, users can organize their data in
a project hierarchy and track all of the operations in the laboratory. To
expedite data entry, user interfaces support a batch mode where data from
multiple procedures, e.g., the hybridization of several chips, can be entered
in a single form. The LIMS incorporates a datamart that stores the expression
data and provides high-performance search functionality.
Sequencing pipeline for a genomics company
a) Requirements: A genomics company had exceeded
the capacity of its sequence-processing pipeline. They required a new system
that was scalable, automated, flexible, and fault-tolerant. The system had to
seamlessly process the chromatograms generated by the sequencing laboratory
through a set of bioinformatics tools with user-specified parameters, and deliver
the resulting data to the end users. The system had to allow for user-defined
processing and delivery, to automatically compute and store read statistics,
to store all chromatograms and FastA files in a secure central repository, and
to automatically notify end-users of relevant events.
b) Our solution: We designed and developed
an automated, highly scalable, and fault-tolerant sequencing pipeline that
validates and processes in real time the chromatograms produced by the sequencers.
The system computes statistics based on user preferences and stores all results
in a centralized and secure database. The input files and the files created
during processing are stored in a file repository. The files and the statistics
for the reads are distributed based on registered user instructions. A highly
flexible reporting tool and a comprehensive security model enable the users
to search the database for data they are authorized to access and to view
the associated sequence files. Reports can be scheduled to run periodically
and are returned to users as text, HTML, or XML.
Public gene expression database
a) Requirements: A consortium of academic researchers
from ten different research organizations around the world needed a secure,
easy-to-use system for sharing, integrating, and extracting data to facilitate
their academic collaboration. These scientists were conducting a series of coordinated
microarray gene expression experiments on several different model organisms
to study polyglutamine-expansion neurodegenerative disorders. Specific requirements
included support for coordination of experimental design, the ability to record
the biological context of samples, extensibility to manage tens of gigabytes
of raw data and thousands of files, and support for sophisticated ad hoc queries
involving biological context, gene annotation, and expression values.
b) Our solution: We developed a centralized,
web-accessible data repository with data loading, data extraction, and sophisticated
search capabilities through an intuitive user interface. The system enables
researchers to compare and analyze expression data generated from different
disease models. It also supports complex querying based on the biological
context of the samples and gene expression criteria. The system parses and
extracts files uploaded from multiple sites into data series from the coordinated
experimental designs (the system currently contains over 15,000 data files
generated from microarray experiments). A robust data security and administration
capability provides flexible, secure data sharing and data access to many
classes of users. It is described by one of the lead users as, “A very robust
system that meets our complex research needs exceptionally well.”
LIMS for an MS proteomics facility
a) Requirements: A large-scale MS proteomics facility
needed a LIMS to track the operation of its laboratories. The laboratories operated
as an industrial facility where many samples were simultaneously characterized
through several workflows and a large battery of instruments operated continuously.
In addition to tracking samples, containers, reagents, consumables, and the
execution of tasks, the LIMS was required to include components for inventory
management, equipment part tracking, equipment maintenance tracking, workstation
configuration, and supplier contact information.
b) Our solution: We implemented a flexible workflow
system that allows users to define their own workflows and to execute these
workflows in the laboratory. Although the underlying database and the associated
software were highly sophisticated, the user interfaces presented to the end users
were straightforward and intuitive. Of the many tasks that might be active
simultaneously in the laboratory, technicians would only see those assigned
to them. Laboratory managers could access information on all of the tasks
and gauge in real time the productivity of the laboratory. Extensive support
for quality control information provided metrics for the quality of the work
for the laboratory, individual technicians, or workstations.
LIMS for a biological sample provider
a) Requirements: A clinical genomics company needed
a Laboratory Information Management System (LIMS) for tracking the verification
of pathology reports for diseased tissue samples. They wanted the LIMS to manage
all aspects of the laboratory operation, including container and sample locations
and barcodes, standard operating procedures, experimental parameters for all
tasks, and results produced by instruments, e.g., images of the stained samples
or pathology reports.
b) Our solution: We created a web-enabled LIMS
that tracks the complete histology and pathology verification processes. The
workflow is initiated when tissue slides are generated from paraffin-embedded
or frozen samples. The slides are tracked at each workstation in the histology
laboratory where they are prepared for the verification step. The LIMS provides
laboratory managers with statistical and throughput reports that provide the
information required for managing and optimizing operations. After the sample
slides pass the histology laboratory quality-control checks, they are provided
to the pathologists. As part of the pathology verification process, the microscopic
features and slide images are recorded in the LIMS. The workflow completes
when samples are sent back to storage. Although a sample may be fully processed,
the location of each slide generated from a sample remains available in the
LIMS.
Curation of annotations for genes
a) Requirements: A company desired to extend the
functionality of an existing software product to enable the tracking, storing,
and versioning of biological objects such as sequences, annotations, or sequence
clusters. They also needed a search engine that would operate at the level of
object versions. Scientists would then be able to pursue work on different versions
of the same objects and keep track of their preferred versions in a project
context, independently of other users. The underlying database was required
to store query snapshots and be able to execute queries in different project
contexts and across several preferred versions of the same object.
b) Our solution: We created a model for representing
a hierarchy of projects that allows for the versioning and tracking of biological
objects and of their relationships. This structure permits the scientists
to work on multiple versions of the same objects simultaneously. It also allows
an organization to reach a consensus on the preferred versions of objects
without affecting the work of individual researchers. The system also facilitates
the construction of contextual queries that are dynamically maintained for
optimum performance.
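One way to picture preferred versions in a project hierarchy is the minimal sketch below. It assumes that a project without its own preference inherits the preference of its nearest ancestor; that inheritance rule, the `Project` class, and the object identifiers are hypothetical and are not taken from the delivered system.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Project:
    """A node in a hypothetical project hierarchy with its own preferred versions."""
    name: str
    parent: Optional["Project"] = None
    preferred: Dict[str, int] = field(default_factory=dict)

    def preferred_version(self, object_id: str) -> Optional[int]:
        """Walk up the hierarchy until a project states a preference for the object."""
        project = self
        while project is not None:
            if object_id in project.preferred:
                return project.preferred[object_id]
            project = project.parent
        return None


# Illustrative identifiers and version numbers only.
organization = Project("organization", preferred={"gene-42": 3})
alice = Project("alice", parent=organization, preferred={"gene-42": 5})
bob = Project("bob", parent=organization)

if __name__ == "__main__":
    print(alice.preferred_version("gene-42"))
    print(bob.preferred_version("gene-42"))
```

Here the organization-level consensus applies to any project that has not chosen otherwise, while an individual researcher's own choice is left untouched.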
© 2005-2010 3rd Millennium, Inc.