Strichman-Almashanu LZ, Bustin M, Landsman D.
Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
Retroposed copies (RPCs) of genes are functional (intronless paralogs) or nonfunctional (processed pseudogenes) copies derived from mRNA through a process of retrotransposition. Previous studies found that gene families involved in mRNA translation or nuclear function were more likely to have large numbers of RPCs. Here we characterize RPCs of the few families coding for the abundant high-mobility-group (HMG) proteins in humans. Using an algorithm we developed, we identified and studied 219 HMG RPCs. For slightly more than 10% of these RPCs, we found evidence indicating expression. Furthermore, eight of these are potentially new members of the HMG families of proteins. For three RPCs, the evidence indicated expression as part of other transcripts; in all of these, we found the presence of alternative splicing or multiple polyadenylation signals. RPC distribution among the HMGs was not even, with 33-65 each for HMGB1, HMGB3, HMGN1, and HMGN2, and 0-6 each for HMGA1, HMGA2, HMGB2, and HMGN3. Analysis of the sequences flanking the RPCs revealed that the junction between the target site duplications and the 5'-flanking sequences exhibited the same TT/AAAA consensus found for the L1 endonuclease, supporting an L1-mediated retrotransposition mechanism. Finally, because our algorithm included aligning RPC flanking sequences with the corresponding HMG genomic sequence, we were able to identify transcribed regions of HMG genes that were not part of the published mRNA sequences.
PMID: 12727900 [PubMed - indexed for MEDLINE]
PMCID: PMC430908
Saturday, October 10, 2009
Friday, October 9, 2009
bioinformatic
The research program in the Computational Biology Branch is carried out by Senior Investigators, tenure track Investigators, Staff Scientists, Postdoctoral Fellows, and students. The program focuses on theoretical, analytical and applied approaches to a broad range of fundamental problems in molecular biology. The expertise of the group is concentrated in sequence analysis, protein structure/function analysis, and gene identification, yet research interests cover a wide range of topics in computational biology and information science. Briefly, these include but are not limited to: database searching algorithms, low-complexity sequences, sequence signals, mathematical models of evolution, statistical methods in virology, dynamic behavior of chemical reaction systems, statistical text-retrieval algorithms, protein structure and function prediction, comparative genomics, taxonomic trees, and population genetics. Many of the basic research projects conducted by CBB investigators serve to enhance and strengthen NCBI’s suite of publicly available databases and software application tools. Collaborative research efforts, among NCBI investigators as well as with the external research community, have led to the development of innovative algorithms (BLAST, PSI-BLAST, SEG, VAST, and COGs) and novel research approaches (text neighboring) that have transformed the field of computational biology. Algorithms and applications currently under development have the potential to further advance scientific discovery. Members of the CBB contribute significantly to the validity and reliability of NCBI’s online resources by reviewing the quality and accuracy of the data deposited in the databases, as well as the accuracy of the information used to annotate the data. 
Members also provide leadership and guidance to the extramural community by planning and organizing scientific consortia to determine the most effective use of public sequence resources for large-scale or high-throughput experimental biology. Researchers collaborate to define known research gaps and to identify mechanisms to bridge these gaps.
Address
Computational Biology Branch
NCBI, NLM, NIH
8600 Rockville Pike MSC 6075
Building 38A, Room 6N601
Bethesda, MD 20894-6075
U.S.A.
Phone: 301-496-2475
Revised: October 1, 2009.