PEDSYS is a database system developed as a specialized tool for management of genetic, pedigree and demographic data. It has been designed principally for use with pedigree analysis of either human or non-human subjects. The system supports integrated collection, management and analysis of constantly evolving data sets by investigators in several different laboratories. Although some of the programs are specialized, many are general enough to be used effectively with data of any sort.

PEDSYS runs under Sun Microsystems Solaris, Red Hat Linux, Microsoft Windows 95/98/NT, and Apple Macintosh OS, and can readily exchange data with other database systems, programs and computers. Particular care has been taken to make it possible for non-experts to use PEDSYS effectively without extensive training, but at the same time to avoid irritating the proficient user.

The PEDSYS package consists of a set of file archives containing programs, examples, and the PEDSYS User's Manual. PEDSYS may be downloaded from these pages without cost, or acquired for a nominal fee on conventional media (CD-ROM), delivered by surface mail.

Functions

Version 2.0 Update Summary

General Design Considerations

Copyright And Limits On Use

Relationships To Other Software

Postal Delivery

Download PEDSYS


Functions

PEDSYS comprises the programs listed below, organized into four categories by function.

A. General Data Management (16 programs)

  AGE       Calculates age in years, months, or days
  BROWSE    Gives formatted screen display of any file
  CALC      Generates functions of one or more items
  DATES     Converts between a variety of calendar date formats
  DUPLIC    Identifies records with duplicate item values
  LIST      Lists contents of Master or Data File on printer
  MERGE     Merges records from one file with records of another file
  NEWITEM   Changes item entries as specified by a substitution table
  OUTLIER   Eliminates values lying above or below bounds
  REFORMAT  Reorders, deletes, or reformats items of records in file
  REPORT    Lists Multiple-entry records of individuals on printer
  SHOW      Displays Single-entry records of EGO, FA and MO
  SHOWDATA  Displays Multiple-entry records of a single individual
  SORT      Performs compound sort of records in file
  SUBSET    Generates compound subsets of records in file
  TALLY     Generates distributions and calculates means and variances

B. File Management (13 programs)

  ALTERDAT  Updates Data Files (edit, add records, start new file)
  ALTMASTR  Creates Master File update file (edit, add records, etc.)
  APPEND    Joins two files having the same Code File specifications
  CODE      Generates a format Code File
  COMBINE   Assembles Multiple-entry values into single records
  DATALINK  Links Data Files to Master Files (and internally as needed)
  DBFILER   Adds, edits, removes Master and Data Files in DBFILES
  DELRECS   Deletes records using output of ALTERDAT or ALTMASTR
  ITEMIZE   Determines PEDSYS format of foreign ASCII data records
  SETDDATA  Updates Data Files using output from ALTERDAT
  SETMASTR  Updates Master Files using output from ALTMASTR
  TRANSLAT  Converts between PEDSYS and foreign record formats
  VERIFIER  Checks record format against Code File specifications

C. Pedigree Management (8 programs)

  ANCESTOR  Finds common ancestors for selected probands
  COUNTPED  Identifies pedigrees and assigns individuals to them
  FOUNDREP  Determines contribution of founders to subsequent generations
  INDEX     Transforms pedigree data into Master File format
  KINSHIP   Calculates inbreeding, kinship coefficients (Quaas-Henderson)
  MAKEPED   Constructs extended pedigrees for specified probands
  PREPDRAW  Creates Pedigree/Draw formatted file from PEDSYS records
  TRIPLETS  Assembles data for EGO, FA and MO on same record
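
KINSHIP is listed above as calculating inbreeding and kinship coefficients. As a rough illustration of what such a calculation involves, the sketch below uses the textbook recursive definition of the kinship coefficient. It is not the Quaas-Henderson procedure named above and is not taken from the PEDSYS source. The pedigree, the IDs, and the function names are hypothetical; individuals are assumed to be numbered so that parents always have smaller IDs than their offspring, with 0 standing for an unknown parent.

```python
from functools import lru_cache

# Hypothetical pedigree: id -> (father, mother); 0 = unknown parent (founder).
# IDs are assigned so that every parent has a smaller ID than its offspring.
PEDIGREE = {
    1: (0, 0),
    2: (0, 0),
    3: (1, 2),
    4: (1, 2),
    5: (3, 4),  # offspring of a full-sib mating
}


@lru_cache(maxsize=None)
def kinship(a, b):
    """Kinship coefficient between individuals a and b (recursive definition)."""
    if a == 0 or b == 0:
        return 0.0  # unknown parents are treated as unrelated to everyone
    if a == b:
        fa, mo = PEDIGREE[a]
        return 0.5 * (1.0 + kinship(fa, mo))
    if a < b:
        a, b = b, a  # always recurse through the parents of the younger one
    fa, mo = PEDIGREE[a]
    return 0.5 * (kinship(fa, b) + kinship(mo, b))


def inbreeding(a):
    """Inbreeding coefficient of a = kinship between its two parents."""
    fa, mo = PEDIGREE[a]
    return kinship(fa, mo)
```

With this pedigree, `kinship(3, 4)` gives 0.25 for the full sibs and `inbreeding(5)` gives 0.25 for their offspring. Plain recursion of this kind is adequate only for small pedigrees; it is shown here for clarity rather than efficiency.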

D. Genetic Management (7 programs)

  DOWNCODE  Simplifies genotypic structure by combining alleles
  GENEFREQ  Calculates gene and genotype frequencies from phenotypic data
  GENOMAP   Generates a Phenotype/Genotype map file
  INFER     Infers unknown genotypes from genotypes of relatives
  PEDTRIM   Simplifies extended pedigrees for genetic analysis
  PEELSEQ   Generates a peeling sequence from an indexed pedigree file
  SEGCHECK  Tallies offspring genotypes by parental mating type
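
To give a sense of the kind of calculation GENEFREQ is described as performing, the sketch below estimates gene (allele) and genotype frequencies by direct gene counting. It assumes the simplest case, a codominant marker where each phenotype identifies the genotype unambiguously, and it ignores relatedness among individuals. The function names and sample data are hypothetical, and the method actually used by GENEFREQ is not described on this page.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate allele frequencies by counting alleles.

    genotypes: iterable of (allele1, allele2) pairs, one per typed individual.
    """
    counts = Counter()
    for a1, a2 in genotypes:
        counts[a1] += 1
        counts[a2] += 1
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def genotype_frequencies(genotypes):
    """Estimate genotype frequencies, treating (A, B) and (B, A) as the same."""
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}


# Hypothetical sample of four typed individuals.
sample = [("A", "A"), ("A", "B"), ("B", "A"), ("B", "B")]
allele_freq = allele_frequencies(sample)
genotype_freq = genotype_frequencies(sample)
```

Markers with dominance or null alleles would need an iterative estimate instead of simple counting, which is outside the scope of this sketch.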


General Design Considerations

Several criteria governed the development of PEDSYS:

  1. An important requirement was that PEDSYS support the collection, management and analysis of data taken from the same population by investigators in several different laboratories.

    PEDSYS databases have a relational structure organized around a Master File containing basic information on vital events and genealogical relationships for individuals in the population. Master File records incorporate as data a set of pointers (Sequential IDs) that link each individual with key members of the nuclear family. Thus, the Master File carries its own indexing for family members, making it possible to construct extended families without the need to access additional files. A separate index file relates these pointers to Permanent IDs of individuals.

    Any number of independently developed Data Files can be joined as needed to the appropriate Master File by means of a multiple data pointer file containing pointers that relate records in the two files. Data Files can be copied or combined in various ways to form entirely new files that maintain their linkage with the Master File.

    A specialized Multiple Entry Data File, which (unlike other PEDSYS files) may contain more than one record for each individual, is used to ensure efficient storage of data that may be measured repeatedly on the same individual. Each record in these files contains a pointer that links it to other records for the same individual in the file.

    This arrangement of files makes it possible for investigators and technical staff to manage and analyze data generated in their own projects, and at the same time to combine these data (a) with a common pool of demographic and genealogical information about the population and (b) with data sets collected in other projects.

  2. Another critical need was the ability to import foreign data (that is, files developed on other database systems), and to export PEDSYS data to other programs or computers for analysis.

    This requirement was met by choosing a simple file structure based on formatted records containing nothing but data represented by printable ASCII characters. That is, there are no file headers, record descriptors, or control characters that must be processed (or avoided) before accessing information kept in the database. Record structure is defined by a separate Code File which accompanies each Master File or Data File.

    In addition, several of the PEDSYS General Data Management programs are designed to transform or reconfigure ASCII data into any required format with considerable ease and flexibility.

  3. Although we anticipated that some individuals would become highly skilled in the use of the system, we felt that it was important for non-experts to be able to use PEDSYS effectively without extensive training. Experience has shown that a common pattern for many scientists and technicians is to use a set of PEDSYS programs intensively for a matter of days or weeks during the course of a project, after which weeks or months will pass before the system is used heavily again. We therefore paid particular attention to the development of an orderly set of commands, prompts and instructions that would make the programs as self-instructing as possible.

    A major effort has been made to keep the programs consistent and easy to use. We have often chosen for the sake of simplicity to write entirely new programs, rather than add new functions to existing programs. As far as possible we have kept the same conventions for keyboard entry and screen displays in all programs, and we have paid particular attention to our use of the English language.

  4. PEDSYS had to accommodate constantly evolving data sets. The continual changes in form and content of research data on growing populations and experimental samples required that the user be able to change or experiment with the composition and format of data quickly and without a major restructuring of the database. Several features make it simple to reorganize data: (a) The independence of PEDSYS files means that any single Data File can be created, changed or deleted without affecting another. Master Files provide a stable core of basic information that remains unaffected as Data Files are changed. (b) A number of programs have been designed specifically to automate the task of merging, reordering, and transforming data.

  5. We feel that large-scale software packages must not be dependent on any one type of computer or operating system. Consequently, PEDSYS programs are written in ANSI Standard FORTRAN-77, with several library routines in ANSI Standard C. Machine-dependent code has been isolated and identified in the few cases where it has not been possible to avoid it altogether (chiefly file access and screen display routines). Written originally for a time-sharing Concurrent Computer (Perkin-Elmer) 3210, the system has been adapted for Sun Microsystems Solaris, the Macintosh OS, Windows 95/98/NT, and MS-DOS. We have deliberately chosen to keep a minimally simple text-based user interface based on the quasi-standard VT100 terminal specifications so that operations appear identical across hardware platforms.
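
Items 1 and 2 above describe records as fixed-format lines of printable ASCII whose layout is given by a separate Code File, with Sequential ID pointers linking each individual to family members. The sketch below illustrates that idea only. The item names, column widths, and the in-memory layout list are assumptions made for the example; the real Code File syntax and Master File layout are defined in the PEDSYS User's Manual and are not reproduced here.

```python
# Hypothetical layout standing in for a Code File: (item name, width in characters).
LAYOUT = [("EGO", 6), ("FASEQ", 4), ("MOSEQ", 4), ("SEX", 1), ("BIRTH", 8)]


def format_record(values):
    """Write one record as a fixed-width ASCII line following LAYOUT."""
    return "".join(str(v).rjust(width) for v, (_, width) in zip(values, LAYOUT))


def parse_record(line):
    """Split one fixed-width line into a dict of items following LAYOUT."""
    record, start = {}, 0
    for name, width in LAYOUT:
        record[name] = line[start:start + width].strip()
        start += width
    return record


# Hypothetical Master File: no header, one line per individual. FASEQ and MOSEQ
# hold the 1-based record position (Sequential ID) of father and mother;
# 0 means the parent is not in the file.
MASTER = [format_record(r) for r in [
    ("A001", 0, 0, "M", "19700105"),
    ("A002", 0, 0, "F", "19710220"),
    ("A003", 1, 2, "F", "19800614"),
    ("A004", 0, 0, "M", "19790301"),
    ("A005", 4, 3, "M", "19890912"),
]]


def ancestors(seq_id, records):
    """Collect Sequential IDs of all ancestors by following parent pointers."""
    found = set()
    stack = [seq_id]
    while stack:
        rec = records[stack.pop() - 1]  # Sequential ID is a 1-based position
        for key in ("FASEQ", "MOSEQ"):
            parent = int(rec[key])
            if parent and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


records = [parse_record(line) for line in MASTER]
```

Because the pointers are stored in the records themselves, `ancestors(5, records)` can assemble the extended family of the fifth record without consulting any other file, which is the property the design relies on.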

A disadvantage of compiled programs that contain no interpreted code is that the user cannot add new functions to pre-existing programs (as can be done with many commercial database systems). On the other hand, our approach makes it easier to control and test the various functions and algorithms used in the programs, as well as to assure consistency and simplicity.

PEDSYS has grown and evolved over the years in response to the suggestions of the scientists and technicians who have used the system. New programs are first written as quickly as possible in response to the needs of a specific research question. Previously developed program modules or algorithms are incorporated wherever possible, but refinements are added only as users gain experience with the program. This approach has resulted in programs that meet a broad range of requirements for pedigree data management.

PEDSYS programs require 15-20 MB of disk storage, depending on the system in use. Population size limits are 60,000 individuals on Solaris and 30,000 individuals on both Macintosh and DOS/Windows. There are no fixed limits to the amount of data that can be maintained for each individual.


Relationships To Other Software

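PEDSYS programs and files are directly accessible by two other packages used in management and analysis of colonies at SFBR and elsewhere (Pedigree/Draw and ACMP). PEDSYS can also exchange data with statistical genetic analysis programs and with software that reads delimited-field text. These are: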

  1. Pedigree/Draw

    Pedigree/Draw is a set of programs which determine the topology of a genealogical diagram from a list of related individuals, automatically display and print the diagram, and provide extensive editing capabilities. Current limits are pedigrees of maximum size 1500, 25 offspring per nuclear family, and up to 80 mates per individual. Pedigree/Draw is currently available only on the Macintosh, since the program relies on the graphics capabilities built into this computer. PEDSYS programs PREPDRAW and TRANSLAT permit interchange of PEDSYS and Pedigree/Draw records.

  2. ACMP

    ACMP is a management analysis package which includes (a) detailed analysis of animal colony fertility and mortality, (b) projection of future colony size, demographic structure, and financial costs from computed vital rates, (c) computation of an optimal harvest schedule, and (d) simulation of interactions between colony genetic and demographic structure. ACMP has extensive graphics output, and currently is available in its fullest development only on the Macintosh.

  3. Statistical genetic analysis software

    PEDSYS program TRANSLAT generates files formatted for statistical genetic analysis programs SOLAR, FISHER and MENDEL, S.A.G.E. (REGC), CRI-MAP, PAP, and IBDMAT.

  4. Delimited-field text

    PEDSYS program TRANSLAT reads and writes records containing ASCII text data with fields delimited by tabs, commas, or other characters, providing ready interchange of data between PEDSYS and foreign software.
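
As an illustration of the delimited-field interchange described in item 4, the sketch below writes records as delimited text and reads them back using Python's standard `csv` module. It shows the general idea only and does not reproduce how TRANSLAT works; the item names and sample record are hypothetical.

```python
import csv
import io

FIELDS = ["EGO", "FA", "MO", "SEX"]  # hypothetical item names


def write_delimited(records, delimiter="\t"):
    """Return the records as delimited text, one line per record."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    for rec in records:
        writer.writerow([rec[name] for name in FIELDS])
    return buffer.getvalue()


def read_delimited(text, delimiter="\t"):
    """Parse delimited text back into a list of records keyed by FIELDS."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [dict(zip(FIELDS, row)) for row in reader]


# Hypothetical record exported with commas, then re-imported.
people = [{"EGO": "A003", "FA": "A001", "MO": "A002", "SEX": "F"}]
comma_text = write_delimited(people, delimiter=",")
round_trip = read_delimited(comma_text, delimiter=",")
```

Passing a different `delimiter` (a tab, a comma, or another single character) changes the field separator without touching the rest of the code.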


Copyright And Limits On Use

The development of PEDSYS represents a substantial and costly effort that has been supported by research grants from the U.S. National Institutes of Health. At the present time, we must charge for production and distribution of the User's Manual, but software distribution costs may be avoided by users willing to download files over the Internet. Proceeds from fees are put into an account dedicated solely to the development and distribution of software for the management of pedigree data used in genetic research.

Although this software is freely available on the Internet, programs comprising PEDSYS Version 2.0 and the User's Manual are not in the public domain, bear copyright notices dated 1989 or later, and are subject to the following conditions of use: the programs are licensed to the user with the understanding that they may not be leased, sold or incorporated into other software packages without the permission of the Southwest Foundation for Biomedical Research.

We ask that those receiving the package from sources other than the Southwest Foundation notify us so that we may keep our distribution list current. The intent of this request is to avoid problems that occur when the software and manual are out of date or incompatible.

We also ask that users of the package report software bugs, confusing instructions, missing functions, etc., to us so that we can continue to make improvements.


Postal Delivery

For those who have limited access to the Internet, or who do not wish to transfer files electronically, we also distribute the PEDSYS software and printed manual by post (First Class Mail to addresses in North America, Airmail to overseas addresses). The fee charged covers the cost of materials and handling.


Prices

For new users

  CD-ROM and User's Manual:  $55
  User's Manual only:        $30

For registered users

  CD-ROM:                    $20
  User's Manual only:        $25



Last modified August 23, 2004
