PEDSYS is a database system developed as a specialized
tool for management of genetic, pedigree and demographic
data. It has been designed principally for use with pedigree
analysis of either human or non-human subjects. The system
supports integrated collection, management and analysis of
constantly evolving data sets by investigators in several
different laboratories. Although some of the programs are
specialized, many are general enough to be used effectively
with data of any sort.
PEDSYS runs
under Sun Microsystems Solaris, Red Hat Linux, Microsoft
Windows 95/98/NT, and Apple Macintosh OS, and can readily
exchange data with other database systems, programs and
computers. Particular care has been taken to make it
possible for non-experts to use PEDSYS effectively without
extensive training, but at the same time to avoid irritating
the proficient user.
The PEDSYS
package consists of a set of file archives containing
programs, examples, and the PEDSYS User's Manual. PEDSYS may
be downloaded from these pages without cost, or acquired for
a nominal fee on conventional media (CD-ROM), delivered by
surface mail.
Functions
PEDSYS
comprises the programs listed below, organized into four
categories by function.
A. General Data Management (16 programs)
| AGE | Calculates age in years, months, or days |
| BROWSE | Gives formatted screen display of any file |
| CALC | Generates functions of one or more items |
| DATES | Converts between a variety of calendar date formats |
| DUPLIC | Identifies records with duplicate item values |
| LIST | Lists contents of Master or Data File on printer |
| MERGE | Merges records from one file with records of another file |
| NEWITEM | Changes item entries as specified by a substitution table |
| OUTLIER | Eliminates values lying above or below bounds |
| REFORMAT | Reorders, deletes, or reformats items of records in file |
| REPORT | Lists Multiple-entry records of individuals on printer |
| SHOW | Displays Single-entry records of EGO, FA and MO |
| SHOWDATA | Displays Multiple-entry records of a single individual |
| SORT | Performs compound sort of records in file |
| SUBSET | Generates compound subsets of records in file |
| TALLY | Generates distributions and calculates means and variances |
B. File Management (13 programs)
| ALTERDAT | Updates Data Files (edit, add records, start new file) |
| ALTMASTR | Creates Master File update file (edit, add records, etc.) |
| APPEND | Joins two files having the same Code File specifications |
| CODE | Generates a format Code File |
| COMBINE | Assembles Multiple-entry values into single records |
| DATALINK | Links Data Files to Master Files (and internally as needed) |
| DBFILER | Adds, edits, removes Master and Data Files in DBFILES |
| DELRECS | Deletes records using output of ALTERDAT or ALTMASTR |
| ITEMIZE | Determines PEDSYS format of foreign ASCII data records |
| SETDDATA | Updates Data Files using output from ALTERDAT |
| SETMASTR | Updates Master Files using output from ALTMASTR |
| TRANSLAT | Converts between PEDSYS and foreign record formats |
| VERIFIER | Checks record format against Code File specifications |
C. Pedigree Management (8 programs)
| ANCESTOR | Finds common ancestors for selected probands |
| COUNTPED | Identifies pedigrees and assigns individuals to them |
| FOUNDREP | Determines contribution of founders to subsequent generations |
| INDEX | Transforms pedigree data into Master File format |
| KINSHIP | Calculates inbreeding, kinship coefficients (Quaas-Henderson) |
| MAKEPED | Constructs extended pedigrees for specified probands |
| PREPDRAW | Creates Pedigree/Draw formatted file from PEDSYS records |
| TRIPLETS | Assembles data for EGO, FA and MO on same record |
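As a rough illustration of the quantities a program such as KINSHIP reports, the sketch below computes kinship and inbreeding coefficients from the textbook recursive definition. It is not the Quaas-Henderson method named in the table. The pedigree layout (a dictionary mapping each ID to father and mother IDs, with None for an unknown parent, and with parents numbered lower than their offspring) is an assumption made for the example, not the PEDSYS file format.

```python
from functools import lru_cache

# Hypothetical pedigree: id -> (father, mother); None marks an unknown parent.
# IDs are assumed to be assigned so that parents precede their offspring.
PEDIGREE = {
    1: (None, None),
    2: (None, None),
    3: (1, 2),
    4: (1, 2),
    5: (3, 4),  # offspring of a full-sib mating
}


@lru_cache(maxsize=None)
def kinship(a, b):
    """Kinship coefficient between individuals a and b (recursive definition)."""
    if a is None or b is None:
        return 0.0  # unknown parents are treated as unrelated founders
    if a == b:
        fa, mo = PEDIGREE[a]
        return 0.5 * (1.0 + kinship(fa, mo))
    # Recurse through the parents of the younger (higher-numbered) individual,
    # so that the recursion never steps past a common ancestor.
    if a < b:
        a, b = b, a
    fa, mo = PEDIGREE[a]
    return 0.5 * (kinship(fa, b) + kinship(mo, b))


def inbreeding(a):
    """Inbreeding coefficient of a: the kinship between its two parents."""
    fa, mo = PEDIGREE[a]
    return kinship(fa, mo)


if __name__ == "__main__":
    for ident in PEDIGREE:
        print(ident, inbreeding(ident))
```

The recursion is easy to follow but revisits the pedigree many times; the memoizing cache keeps that tolerable for small examples only.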
D. Genetic Management (7 programs)
| DOWNCODE | Simplifies genotypic structure by combining alleles |
| GENEFREQ | Calculates gene and genotype frequencies from phenotypic data |
| GENOMAP | Generates a Phenotype/Genotype map file |
| INFER | Infers unknown genotypes from genotypes of relatives |
| PEDTRIM | Simplifies extended pedigrees for genetic analysis |
| PEELSEQ | Generates a peeling sequence from an indexed pedigree file |
| SEGCHECK | Tallies offspring genotypes by parental mating type |
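To make the idea of gene and genotype frequencies concrete, here is a minimal sketch of simple gene counting. It assumes a codominant locus at which each phenotype identifies the genotype directly; the sample data and the "A/B" notation are invented for the example and are not taken from GENEFREQ or any PEDSYS file.

```python
from collections import Counter

# Hypothetical genotypes at one codominant locus, written as allele pairs.
GENOTYPES = ["A/A", "A/B", "A/B", "B/B", "A/A", "A/C", "B/C", "A/B"]


def genotype_frequencies(genotypes):
    """Proportion of individuals carrying each genotype."""
    # Sort the two alleles so that "B/A" and "A/B" are counted together.
    counts = Counter("/".join(sorted(g.split("/"))) for g in genotypes)
    n = len(genotypes)
    return {g: c / n for g, c in counts.items()}


def allele_frequencies(genotypes):
    """Proportion of each allele among all gene copies (two per individual)."""
    counts = Counter(allele for g in genotypes for allele in g.split("/"))
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


if __name__ == "__main__":
    print(genotype_frequencies(GENOTYPES))
    print(allele_frequencies(GENOTYPES))
```

Loci with dominance or null alleles need an estimation step that this direct count does not attempt.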

General Design Considerations
Several
criteria governed the development of PEDSYS:
- An
important requirement was that PEDSYS support the
collection, management and analysis of data taken from
the same population by investigators in several different
laboratories.
PEDSYS databases have a relational structure organized
around a Master File containing basic information on
vital events and genealogical relationships for
individuals in the population. Master File records
incorporate as data a set of pointers (Sequential IDs)
that link each individual with key members of the nuclear
family. Thus, the Master File carries its own indexing
for family members, making it possible to construct
extended families without the need to access additional
files. A separate index file relates these pointers to
Permanent IDs of individuals.
Any number of independently developed Data Files can be
joined as needed to the appropriate Master File by means
of a multiple data pointer file containing pointers that
relate records in the two files. Data Files can be copied
or combined in various ways to form entirely new files
that maintain their linkage with the Master File.
A specialized Multiple Entry Data File, which (unlike
other PEDSYS files) may contain more than one record for
each individual, is used to ensure efficient storage of
data that may be measured repeatedly on the same
individual. Each record in these files contains a pointer
that links it to other records for the same individual in
the file.
This arrangement of files makes it possible for
investigators and technical staff to manage and analyze
data generated in their own projects, and at the same
time to combine these data (a) with a common pool of
demographic and genealogical information about the
population and (b) with data sets collected in other
projects.
- Another
critical need was the ability to import foreign data
(that is, files developed on other database systems), and
to export PEDSYS data to other programs or computers for
analysis.
This requirement was met by choosing a simple file
structure based on formatted records containing nothing
but data represented by printable ASCII characters. That
is, there are no file headers, record descriptors, or
control characters that must be processed (or avoided)
before accessing information kept in the database. Record
structure is defined by a separate Code File which
accompanies each Master File or Data File.
In addition, several of the PEDSYS General Data
Management programs are designed to transform or
reconfigure ASCII data into any required format with
considerable ease and flexibility.
- Although
we anticipated that some individuals would become highly
skilled in the use of the system, we felt that it was
important for non-experts to be able to use PEDSYS
effectively without extensive training. Experience has
shown that a common pattern for many scientists and
technicians is to use a set of PEDSYS programs
intensively for a matter of days or weeks during the
course of a project, after which weeks or months will
pass before the system is used heavily again. We
therefore paid particular attention to the development of
an orderly set of commands, prompts and instructions that
would make the programs as self-instructing as
possible.
A major effort has been made to keep the programs
consistent and easy to use. We have often chosen for the
sake of simplicity to write entirely new programs, rather
than add new functions to existing programs. As far as
possible we have kept the same conventions for keyboard
entry and screen displays in all programs, and we have
paid particular attention to our use of the English
language.
- PEDSYS had
to accommodate constantly evolving data sets. The
continual changes in form and content of research data on
growing populations and experimental samples required
that the user be able to change or experiment with the
composition and format of data quickly and without a
major restructuring of the database. Several features
make it simple to reorganize data: (a) The independence
of PEDSYS files means that any single Data File can be
created, changed or deleted without affecting another.
Master Files provide a stable core of basic information
that remains unaffected as Data Files are changed. (b) A
number of programs have been designed specifically to
automate the task of merging, reordering, and
transforming data.
- We feel
that large-scale software packages must not be dependent
on any one type of computer or operating system.
Consequently, PEDSYS programs are written in ANSI
Standard FORTRAN-77, with several library routines in
ANSI Standard C. Machine-dependent code has been isolated
and identified in the few cases where it has not been
possible to avoid it altogether (chiefly file access and
screen display routines). Written originally for a
time-sharing Concurrent Computer (Perkin-Elmer) 3210, the
system has been adapted for Sun Microsystems Solaris, the
Macintosh OS, Windows 95/98/NT, and MS-DOS. We have
deliberately chosen to keep a minimally simple text-based
user interface based on the quasi-standard VT100 terminal
specifications so that operations appear identical across
hardware platforms.
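The Master File arrangement described in the first item of the list above can be sketched as follows. Everything specific here is hypothetical: the field names, column positions, permanent IDs, and the convention that a sequential ID equals a record's position in the file are assumptions chosen for illustration, not the actual PEDSYS formats. The sketch shows headerless fixed-format ASCII records whose layout is described separately (the role the text gives to a Code File), and parent pointers that allow ancestors to be traced without opening any other file.

```python
# Hypothetical stand-in for a Code File: field name -> (start, end) columns.
CODE = {"seq": (0, 4), "fa": (4, 8), "mo": (8, 12), "sex": (12, 13), "id": (13, 21)}

# Hypothetical Master File: one fixed-width line per individual, no header.
# "fa" and "mo" hold the sequential IDs of the parents; 0 means unknown.
MASTER = [
    "   1   0   0MBAB00101",
    "   2   0   0FBAB00102",
    "   3   1   2MBAB00203",
    "   4   0   0FBAB00204",
    "   5   3   4MBAB00305",
]


def parse(line):
    """Split one fixed-width record into named fields using CODE."""
    rec = {name: line[a:b].strip() for name, (a, b) in CODE.items()}
    for key in ("seq", "fa", "mo"):
        rec[key] = int(rec[key])
    return rec


# Assumption: sequential ID n is the n-th record, so lookup is records[n - 1].
records = [parse(line) for line in MASTER]

# Separate index relating permanent IDs to sequential IDs.
INDEX = {rec["id"]: rec["seq"] for rec in records}


def ancestors(seq):
    """Return the sequential IDs of all known ancestors of one individual."""
    found = set()
    stack = [seq]
    while stack:
        rec = records[stack.pop() - 1]
        for parent in (rec["fa"], rec["mo"]):
            if parent and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def ancestors_by_id(permanent_id):
    """Same lookup, but entered and reported by permanent ID."""
    return sorted(records[s - 1]["id"] for s in ancestors(INDEX[permanent_id]))


if __name__ == "__main__":
    print(ancestors_by_id("BAB00305"))
```

Because the pointers are stored as ordinary data in each record, the traversal needs only the Master File and the small ID index.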
A
disadvantage of compiled programs that contain no
interpreted code is that the user cannot add new functions
to pre-existing programs (as can be done with many
commercial database systems). On the other hand, our
approach makes it easier to control and test the various
functions and algorithms used in the programs, as well as to
assure consistency and simplicity.
PEDSYS has
grown and evolved over the years in response to the
suggestions of the scientists and technicians who have used
the system. New programs are first written as quickly as
possible in response to the needs of a specific research
question. Previously developed program modules or algorithms
are incorporated wherever possible, but refinements are
added only as users gain experience with the program. This
approach has resulted in programs that meet a broad range of
requirements for pedigree data management.
PEDSYS
programs require 15-20 MB of disk storage, depending on
the system in use. Population size limits: Solaris - 60,000
individuals, Macintosh - 30,000 individuals, and DOS/Windows
- 30,000 individuals. There are no fixed limits to the
amount of data that can be maintained for each
individual.

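Relationships To Other Software
PEDSYS
programs and files are directly accessible by two other
packages used in management and analysis of colonies at SFBR
and elsewhere, Pedigree/Draw and ACMP. PEDSYS data can also
be exchanged with statistical genetic analysis software and
with delimited-field text files: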
- Pedigree/Draw
Pedigree/Draw is a set of programs which determine the
topology of a genealogical diagram from a list of related
individuals, automatically display and print the diagram,
and provide extensive editing capabilities. Current
limits are pedigrees of maximum size 1500, 25 offspring
per nuclear family, and up to 80 mates per individual.
Pedigree/Draw is currently available only on the
Macintosh, since the program relies on the graphics
capabilities built into this computer. PEDSYS programs
PREPDRAW and TRANSLAT permit interchange of PEDSYS and
Pedigree/Draw records.
- ACMP
ACMP is a management analysis package which includes (a)
detailed analysis of animal colony fertility and
mortality, (b) projection of future colony size,
demographic structure, and financial costs from computed
vital rates, (c) computation of an optimal harvest
schedule, and (d) simulation of interactions between
colony genetic and demographic structure. ACMP has
extensive graphics output, and currently is available in
its fullest development only on the Macintosh.
- Statistical
genetic analysis software
PEDSYS program TRANSLAT generates files formatted for
statistical genetic analysis programs SOLAR, FISHER and
MENDEL, S.A.G.E. (REGC), CRI-MAP, PAP, and IBDMAT.
- Delimited-field
text
PEDSYS program TRANSLAT reads and writes records
containing ASCII text data with fields delimited by tabs,
commas, or other characters, providing ready interchange
of data between PEDSYS and foreign software.
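The delimited-field interchange described in the last item above can be illustrated with a short sketch. It is not TRANSLAT itself: the field layout, the sample records and the function names are hypothetical, and the conversion relies on Python's standard csv module. It turns fixed-format records into tab- or comma-delimited text and back again.

```python
import csv
import io

# Hypothetical layout: (field name, start column, end column) per field.
LAYOUT = [("id", 0, 8), ("sex", 8, 9), ("birth", 9, 17), ("weight", 17, 23)]

# Hypothetical fixed-format records; every field is left-justified.
RECORDS = [
    "BAB00101M1985031212.5  ",
    "BAB00102F198607049.8   ",
]


def fixed_to_delimited(lines, delimiter="\t"):
    """Write fixed-format records as delimited text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow([name for name, _, _ in LAYOUT])
    for line in lines:
        writer.writerow([line[a:b].strip() for _, a, b in LAYOUT])
    return out.getvalue()


def delimited_to_fixed(text, delimiter="\t"):
    """Read delimited text and pad (or truncate) each field to its width."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [
        "".join(row[name].ljust(b - a)[: b - a] for name, a, b in LAYOUT)
        for row in reader
    ]


if __name__ == "__main__":
    text = fixed_to_delimited(RECORDS, delimiter=",")
    print(text)
    print(delimited_to_fixed(text, delimiter=","))
```

Values longer than their field are silently truncated on the way back; a real converter would more likely report them.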

Copyright And Limits On Use
The
development of PEDSYS represents a substantial and
costly effort that has been supported by research grants
from the U.S. National Institutes of Health. At the present
time, we must charge for production and distribution of the
User's Manual, but software distribution costs may be
avoided by users willing to download files over the
Internet. Proceeds from fees are put into an account
dedicated solely to the development and distribution of
software for the management of pedigree data used in genetic
research.
Although
this software is freely available on the Internet,
programs comprising PEDSYS Version 2.0 and the User's Manual
are not in the public domain, bear copyright notices dated
1989 or later, and are subject to the following conditions
of use: the programs are licensed to the user with the
understanding that they may not be leased, sold or
incorporated into other software packages without the
permission of the Southwest Foundation for Biomedical
Research.
We ask that
those receiving the package from sources other than the
Southwest Foundation notify us so that we may keep our
distribution list current. The intent of this request is to
avoid problems that occur when the software and manual are
out of date or incompatible.
We also ask
that users of the package report software bugs,
confusing instructions, missing functions, etc., to us so
that we can continue to make improvements.

Postal Delivery
For those
who have limited access to the Internet, or who do not
wish to transfer files electronically, we also distribute
the PEDSYS software and printed manual by post (First Class
Mail to addresses in North America, Airmail to overseas
addresses). The fee charged covers the cost of materials and
handling.
Prices
For new users
| CD-ROM and User's Manual | $55 |
| User's Manual only | $30 |
For registered users
| CD-ROM | $20 |
| User's Manual only | $25 |