Chapter 35 Reflections on Bioinformatics

...the first to assemble databases of these sequences into a protein sequence atlas in the 1960s,

...organized the proteins into families and superfamilies based on the degree of sequence similarity.

Tables that reflected the frequency of changes observed in the sequences of a group of closely related proteins were then derived.
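As a rough illustration of how a table of observed changes might be tallied, the sketch below counts the residue pairs that differ in each column of a small alignment. The function name `substitution_counts` and the toy alignment are assumptions made for illustration. Counting all sequence pairs is a simplification, not the historical procedure.

```python
from collections import Counter
from itertools import combinations


def substitution_counts(aligned_seqs):
    """Count unordered pairs of differing residues seen in each alignment column.

    Simplifying assumption: every pair of sequences is compared directly,
    and columns containing a gap ('-') in either sequence are skipped.
    """
    counts = Counter()
    for column in zip(*aligned_seqs):
        for a, b in combinations(column, 2):
            if a == "-" or b == "-":
                continue
            if a != b:
                # Store the pair in sorted order so (C, S) and (S, C) are merged.
                counts[tuple(sorted((a, b)))] += 1
    return counts


# Hypothetical toy alignment of four closely related sequences.
toy_alignment = ["ACDE", "ACDE", "ASDE", "ASEE"]
for pair, n in sorted(substitution_counts(toy_alignment).items()):
    print(pair, n)
```

Dividing such counts by how often each residue occurs would turn them into frequencies. That normalization step is left out here.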

The appearance of marked points aids communication and saves resources. Is this a hierarchical coupling point?

The 14 most common symptoms are chest pain, fatigue, dizziness, headache, edema, back pain, dyspnea, insomnia, abdominal pain, numbness, sexual dysfunction, weight loss, cough, and constipation

Learn to analyze, hypothesize, reason, model

Analyze data, clarify mechanisms, test feedback, and make continuous corrections

The importance of a node in the protein interaction network can be quantified by its number of connections, or by its share of all connections, where the number of connections is assumed to follow some distribution, such as a normal distribution
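Counting connections per node can be sketched as follows. The edge list and protein names are hypothetical, and the function names `node_degrees` and `relative_degrees` are assumptions made for illustration.

```python
from collections import defaultdict


def node_degrees(edges):
    """Return the number of distinct interaction partners of each node."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions in this sketch
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(partners) for node, partners in neighbors.items()}


def relative_degrees(degrees):
    """Express each degree as a share of the sum of all degrees."""
    total = sum(degrees.values())
    return {node: d / total for node, d in degrees.items()}


# Hypothetical toy interaction network.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
degrees = node_degrees(toy_edges)
print(degrees)
print(relative_degrees(degrees))
```

Plotting a histogram of these degrees for a real network is one way to check which distribution the connection counts actually follow.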

Genes, proteins, and phenotypes form a coupled system of differential equations, and a particular gene, protein, and phenotype is a particular solution. This is an integrated approach to understanding the coupling of biological pathways, networks, systems, and hierarchies.

If we treat the control of the expression of all genes in a cell at the same time like a wave function and find its eigenvalues, we can borrow many concepts from quantum mechanics, such as operators and commutativity. By analogy with the uncertainty principle, we do not expect to monitor the abundance, localization, modification, and activity of all proteins in a cell in real time, but an approximation should be possible. We think of the cell as a wave packet, each pathway as a wave function (recombinable; a different division or measurement means a new wave function), and different networks as eigenvalues. Their superposition should be linear, but in a large system the distribution is itself a wave function if it is reduced to sufficient accuracy to be observed. We assume that the whole can be built into a loop-like structure, a belief about nature that I draw from topology.

Simple superposition --- linear superposition (hierarchical coupling, internalization and externalization)

Presentation: stomach pain; inability to stand and walk independently (a problem with muscle development; primary information extraction); normal before the age of three and a half (further evidence of a developmental cause); thick calves and thin thighs; absent knee reflex but normal ankle reflex (neurodevelopmental or otherwise); abnormal heart sounds; abnormal mental health.

--- Blood tests and muscle biopsy (a macroscopic reflection of cell numbers; beyond individual quantities, deviations from normal values at the molecular level can be arranged in a matrix to build a network, and a hidden Markov model suggests that specific combinations can be constructed probabilistically): normal red blood cells, high leukocytes, high glucose, very high creatine kinase, normal glucose-6-phosphate dehydrogenase; on biopsy, some normal tissue fibers, some fibers with internal nuclei, uneven fiber thickness, and fat and connective tissue dispersed among the fibers.

--- Family history: the possibility of a genetic disease, dominant or recessive, monogenic or polygenic.

--- Chromosomal karyotyping and restriction fragment length polymorphism (RFLP) analysis to determine whether the trait is dominant or recessive.

--- Phenotypic differences and the chromosomal recombination pattern (the probability of detecting abnormalities in a given cell): the role of chromosomes is an expression of a probability space, and chromosomal recombination modifies that probability space so that it tends to express one of the X chromosomes.

------- DNA sequence analysis

Final diagnosis: pseudohypertrophic muscular dystrophy

Further research is warranted as muscular dystrophy can be associated with nerve growth and ultimately brain damage due to various levels of coupling.

Seek animal models, observe the changes during disease development, and apply different perturbations to observe the feedback. For example, antibodies raised against dystrophin were used to immunolabel the skeletal muscle of wild-type and mdx strain mice. The protein was found on the inner surface of the skeletal muscle cell membrane and also in neurons in the brain, which explains the damage to the nervous system and also shows the relationship between gene and protein expression, namely the spatiotemporal expression of dystrophin; ..., rather than protein overexpression or hyperactivity as in dominant disorders.

When a protein is sequenced, a lot of related information turns up, such as the related protein utrophin. The identity and similarity of the sequences are compared to determine whether they are homologous, and similarity of protein function is then predicted from the sequence similarity
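Comparing two already-aligned sequences can be sketched as the fraction of identical positions and the fraction of similar positions. The residue groups in `TOY_GROUPS` are a hypothetical simplification chosen for illustration, not a standard scoring scheme, and the example sequences are made up.

```python
# Hypothetical grouping of amino acids with broadly similar properties.
TOY_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ"]


def identity_and_similarity(aligned_a, aligned_b, groups=TOY_GROUPS):
    """Return (identity, similarity) as fractions of the gap-free columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    group_of = {res: i for i, group in enumerate(groups) for res in group}
    compared = identical = similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # columns with a gap are not compared
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif a in group_of and group_of.get(b) == group_of[a]:
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return identical / compared, similar / compared


# Made-up aligned fragments; '-' marks a gap.
print(identity_and_similarity("MKV-LD", "MRVALE"))
```

High identity or similarity is evidence for homology, not proof of it. Where to set the threshold depends on sequence length and is not addressed here.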

The focus shifts from sequence-structure-function to gene expression, system dynamics, and physiological function

Interacting proteins are those that are involved in the same metabolic pathway or biological process, or are part of the same structural complex or molecular machine. They may be in direct "physical" contact with each other, or they may have no physical contact and be related only genetically

Interactions occur at multiple levels of coupling (metabolic and biological pathways, structure, the gene level), and also with RNA and at the physiological level

These interactions are built into a network with certain topological properties, which reaches a steady state much as a chemical equilibrium does

Methods for studying protein-protein interactions: yeast two-hybrid system, mass spectrometry and protein chips, computer simulations

Phylogenetic profile: Functionally related groups of genes are expected to be present or absent together across a set of fully sequenced genomes. This pattern of presence or absence is called a phylogenetic profile. If two genes have no sequence homology but their phylogenetic profiles are identical or similar, it is inferred that they are functionally related
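Comparing presence/absence patterns across genomes can be sketched as below. Each profile is a tuple with one entry per genome (1 for present, 0 for absent). The gene names, the profiles, and the functions `profile_similarity` and `linked_pairs` are hypothetical.

```python
from itertools import combinations


def profile_similarity(profile_a, profile_b):
    """Fraction of genomes in which the two genes are both present or both absent."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    matches = sum(x == y for x, y in zip(profile_a, profile_b))
    return matches / len(profile_a)


def linked_pairs(profiles, threshold=1.0):
    """Return gene pairs whose profiles agree in at least `threshold` of the genomes."""
    pairs = []
    for gene_a, gene_b in combinations(sorted(profiles), 2):
        if profile_similarity(profiles[gene_a], profiles[gene_b]) >= threshold:
            pairs.append((gene_a, gene_b))
    return pairs


# Hypothetical profiles over six fully sequenced genomes.
toy_profiles = {
    "geneA": (1, 0, 1, 1, 0, 1),
    "geneB": (1, 0, 1, 1, 0, 1),
    "geneC": (0, 1, 0, 1, 1, 0),
}
print(linked_pairs(toy_profiles))
```

A gene present in every genome matches every other ubiquitous gene, so profiles with little variation carry little information. A practical analysis would need to account for that.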

Gene neighborhood: In the bacterial genome, functionally related genes are closely linked in a specific region to form an operon, and the adjacency between genes is conserved in the process of species evolution and can be used as an indicator of the functional relationship between gene products. This approach seems to be only applicable to structurally simple microorganisms in the early stages of evolution.

Gene fusion event: This method assumes that two (or more) interacting proteins in one species may be fused into a single polypeptide chain in another species. A gene fusion that occurred during the evolution of a species can therefore be used as an indication of functional relatedness or interaction between the proteins.
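The fusion idea can be sketched on a toy representation in which each protein is described by the domains it contains. The protein names, domain labels, and the function `fusion_links` are hypothetical. Real analyses work from sequence similarity searches rather than ready-made domain labels.

```python
from itertools import combinations


def fusion_links(separate, fused):
    """Link two single-domain proteins if some protein elsewhere contains both domains.

    separate: dict mapping protein name -> its domain (species 1)
    fused:    dict mapping protein name -> list of its domains (species 2)
    """
    links = []
    for (name_a, dom_a), (name_b, dom_b) in combinations(sorted(separate.items()), 2):
        if dom_a == dom_b:
            continue  # the same domain twice is not a fusion of two partners
        for fused_name, fused_domains in sorted(fused.items()):
            if dom_a in fused_domains and dom_b in fused_domains:
                links.append((name_a, name_b, fused_name))
    return links


# Hypothetical data: species 1 has three separate proteins, species 2 has one fusion.
species1 = {"protA": "D1", "protB": "D2", "protC": "D3"}
species2 = {"fusedAB": ["D1", "D2"]}
print(fusion_links(species1, species2))
```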

Mirror tree: The idea of this method is that functionally related proteins, or domains of the same protein, are subject to shared functional constraints, so their evolution should be consistent, i.e., they show co-evolution. Their phylogenetic trees are constructed and compared; if the topologies are similar (such similar trees are called mirror trees), it can be inferred that the genes used to build the trees are functionally related
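One common way to approximate a mirror tree comparison is to correlate the pairwise distance matrices of the two protein families instead of comparing tree topologies directly. The sketch below takes that shortcut, which is a swapped-in simplification of the tree comparison described above. The distance matrices are made up, and both must list the same species in the same order.

```python
from math import sqrt


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / sqrt(var_x * var_y)


def mirror_tree_score(dist_a, dist_b):
    """Correlate the inter-species distances of two protein families."""
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Made-up distance matrices over the same four species.
family_a = [[0, 1, 2, 3], [1, 0, 2, 3], [2, 2, 0, 3], [3, 3, 3, 0]]
family_b = [[0, 2, 4, 6], [2, 0, 4, 6], [4, 4, 0, 6], [6, 6, 6, 0]]
print(mirror_tree_score(family_a, family_b))
```

Any two families share some similarity simply because they follow the same species tree, so a high score on its own overstates the evidence for co-evolution.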

Correlated mutation: For proteins that are in physical contact with each other, such as proteins in the same structural complex, the residue changes accumulated in one protein during evolution are compensated by corresponding changes in the other. This phenomenon is called correlated mutation and can be seen as a mirror tree at the amino acid level. Its molecular mechanism is thought to counteract the constant mutational drift of the genes, which keeps the structural complex stable and preserves its function
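Compensating changes between two alignment columns can be scored in several ways. The sketch below uses mutual information, a stand-in chosen here for simplicity and not necessarily the measure used by the correlated mutation method itself. Each column holds one residue per species, in the same species order for both proteins. The columns are made up.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same species")
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi


# Made-up columns: residues change together in the first pair, independently in the second.
print(mutual_information("AAVV", "KKEE"))
print(mutual_information("AAVV", "KEKE"))
```

With only a handful of sequences, such scores are noisy, and shared ancestry alone can produce apparent correlation.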

Correlated sequence signatures: This method uses the association of sequence domain signatures as an indicator for recognizing interacting proteins. It can predict interactions between proteins of unknown function and known proteins, which reduces the search space for direct experiments

Conserved interologs: Interactions between proteins are conserved during species evolution, so protein-protein interactions in other species can be predicted through the protein-protein interaction network established in one species.
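Transferring interactions from one species to another can be sketched as a lookup through an ortholog map. The protein names, the ortholog map, and the function `predict_interologs` are hypothetical. How the orthologs are identified is outside this sketch.

```python
def predict_interologs(known_interactions, orthologs):
    """Map each known interaction to the other species when both partners have orthologs.

    known_interactions: iterable of (protein, protein) pairs in the source species
    orthologs:          dict mapping source protein -> target-species ortholog
    """
    predicted = set()
    for a, b in known_interactions:
        if a in orthologs and b in orthologs:
            # Sort so that (x, y) and (y, x) are treated as the same interaction.
            predicted.add(tuple(sorted((orthologs[a], orthologs[b]))))
    return predicted


# Hypothetical source network and ortholog map.
source_network = [("yA", "yB"), ("yB", "yC")]
ortholog_map = {"yA": "hA", "yB": "hB"}
print(predict_interologs(source_network, ortholog_map))
```

The second interaction is not transferred because `yC` has no ortholog in the map.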

Homologous structural complexes: This method starts from protein complexes with known three-dimensional structures and assumes that members of the same families as their components interact with each other in the same way

Correlated evolutionary rate: The evolutionary rate of a protein depends on the number of its interactions with other proteins, and the correlation is negative, that is, the higher the number of interactions, the lower the evolutionary rate

The time limit for concentration is 10-18 min

Case teaching, discussion, resonance

Art, the elimination of redundancy, that is, the elimination of uncertainty, and the rest is useful information

Participate in the learning process and take the initiative

will be integrated into life

Creativity and awareness

Exchanges are made in jumps. Each round of sorting sets a pivot (datum point): the numbers less than or equal to the pivot are all placed on its left, and the numbers greater than or equal to the pivot are all placed on its right.
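The procedure described here matches quicksort. A minimal in-place sketch follows, assuming the last element of each range is taken as the pivot (the notes do not say how the pivot is chosen).

```python
def quicksort_in_place(values, low=0, high=None):
    """Sort `values` in place by repeatedly partitioning around a pivot."""
    if high is None:
        high = len(values) - 1
    if low >= high:
        return  # ranges of length 0 or 1 are already sorted
    pivot = values[high]  # assumption: the last element serves as the pivot
    boundary = low  # everything left of `boundary` is <= pivot
    for i in range(low, high):
        if values[i] <= pivot:
            # Swap the small element across to the left part.
            values[i], values[boundary] = values[boundary], values[i]
            boundary += 1
    # Put the pivot between the two parts.
    values[boundary], values[high] = values[high], values[boundary]
    quicksort_in_place(values, low, boundary - 1)
    quicksort_in_place(values, boundary + 1, high)


data = [6, 1, 2, 7, 9, 3, 4, 5, 10, 8]
quicksort_in_place(data)
print(data)
```

The swaps move elements across long distances in one step, which is the sense in which the exchange is a jump and not a step between neighbors.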

DNA-RNA-protein is a space where different levels of coupling are combined

Semi-conservative, bidirectional, semi-discontinuous replication is a phenomenon whose deeper meaning may be an analog of the physical world, i.e., quantization

Semi-conservative replication is a pattern, neither fully conservative nor dispersive, which is the statistics of the whole, and semi-discontinuity is the manifestation of its individualization

Bidirectionality builds the space of possibilities, so that the benefit of speed is maximized

The replication origins are discrete, which is a simplified replication, i.e., each can be seen as a sublayer

Catalysis, Metabolic Regulation, Immune Protection, Transport and Storage, Movement and Support, Message Transfer, Oxidative Energy Supply

The proportions of the number of coupling connections reflect the nature

Elemental composition: different components, proportions, distributions, combinations, one-dimensional sequences and two-dimensional domains and three-dimensional spaces

Under the central dogma, double-stranded DNA is a closed source of information, and single-stranded RNA is a kind of break that allows the information to be expressed, because the two complementary strands are both valid, i.e., undecidable. Next comes the translation into protein, which embodies a kind of degeneracy
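The degeneracy of translation can be made concrete by counting how many codons map to the same amino acid. The table below is a deliberately partial excerpt of the standard genetic code, covering only four amino acids. The function name `codons_per_amino_acid` is an assumption made for illustration.

```python
from collections import Counter

# Partial excerpt of the standard genetic code (RNA codons); not the full table.
PARTIAL_CODON_TABLE = {
    "UUA": "Leu", "UUG": "Leu", "CUU": "Leu", "CUC": "Leu", "CUA": "Leu", "CUG": "Leu",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "AUG": "Met",
    "UGG": "Trp",
}


def codons_per_amino_acid(table):
    """Count how many codons encode each amino acid in the given table."""
    return dict(Counter(table.values()))


print(codons_per_amino_acid(PARTIAL_CODON_TABLE))
```

Leucine has six codons in this excerpt, while methionine and tryptophan have one each, so the mapping from codon to amino acid is many-to-one.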

Roughly 1% is exons and 99% is introns and repeats, which is a coupling and should reflect a certain distribution

Paralogous, self-homology--- different levels of optimal paths, pattern identification, characteristic signals, epigenetic expression and another level of heredity

Phenotype = genotype + environment + genotype * environment

Logical non-canonical is like the relationship between classical heredity and epigenetics

Modifications result in different structures and outcomes, and are another level of translation, to amino acid residue sites or other modification sites

High-throughput, with peak detection, different levels of detection make different sensitivities react, and then build a network at the overall level, Bayesian network helps to build a dynamic network, so that the appearance of relationships is predictable

Can algorithms, fortune telling, provide some enlightenment?

Algorithm is king. It is the structure of multi-level competition in the network: depending on different external influencing factors and different internal biases, such as a one-time allele, a certain structure is finally expressed. We can't have the best of both worlds; we have to make trade-offs, and this choice is our final path to network collapse. A sorting algorithm delineates a certain important order, which is something intrinsic. A greedy algorithm takes, at each step, the best (i.e., the most favorable) choice in the current state; the overall result may not be optimal, but it is often close. Dynamic programming is a kind of traversal that can find the overall optimum, but it is slow. In the shortest path, the energy is minimized, and convergence becomes a relatively independent level. Each of these algorithms has strengths and weaknesses, so choosing among the algorithms that make choices is itself a choice and a trade-off. The local optimum is the better choice given the information at hand; even if it is not the overall optimum, it is a diffusion of the intrinsic features and has a high probability. Naturally, certain sequences of choices can reach the overall optimum, but their probability is very low.
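The contrast between a greedy choice and dynamic programming can be seen on a small coin-change problem. The problem and the coin values are chosen here purely for illustration and do not come from the notes.

```python
def greedy_coin_count(coins, amount):
    """Always take the largest coin that still fits; return the number of coins used."""
    count = 0
    for coin in sorted(coins, reverse=True):
        used, amount = divmod(amount, coin)
        count += used
    return count if amount == 0 else None  # None: greedy could not make the amount


def dp_coin_count(coins, amount):
    """Dynamic programming: the fewest coins for every target up to `amount`."""
    unreachable = float("inf")
    best = [0] + [unreachable] * amount
    for target in range(1, amount + 1):
        for coin in coins:
            if coin <= target and best[target - coin] + 1 < best[target]:
                best[target] = best[target - coin] + 1
    return best[amount] if best[amount] != unreachable else None


coins = [1, 3, 4]
print(greedy_coin_count(coins, 6))  # greedy picks 4 + 1 + 1
print(dp_coin_count(coins, 6))      # dynamic programming finds 3 + 3
```

The greedy version looks at each coin once, while the dynamic programming version fills a table over every target amount. That extra work is the price of the guaranteed optimum.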

The above is for the individual; next comes the whole. Since competition among individuals brings a certain vitality to the whole, there is no need to focus on the individual for the time being, because the collapse of the wave function can always construct a path from beginning to end, and we have this confidence. It is therefore necessary to consider the construction of the overall environment, such as laws formulated as the most basic guarantee of the economy and society, and then the free development of the individual: through the endless combinations of molecular thermal motion, some meaningful path always emerges. The choice of hierarchy and the method of selection are the result of selective expression, which is an adaptation to the current environment. The distribution of node connections in the population follows a power law, and the network has a certain degree of connectivity

Finally, there is the combination of reductionism and holism, that is, the integration into the structure of higher dimensions. Due to the limitations of the individual's vision and the limitation of information flow, according to the operation of the Bayesian formula, the importance of the individual is infinite for the individual, just like the selflessness and no world of the mind, so there is such a coupling relationship: the collapse of the overall network constructs the individual, and the infinite diffusion of the individual forms the network. It is all the result of competition: between individuals, between individuals and the whole, between the whole and the whole. This coupled structure has the properties of a wave function, it is always functioning, and we can only observe a part of its overall process (called collapse in physics), i.e., selective expression. It is highly adaptable to the environment

Similarity and homology: sequence similarity leads to structural similarity, which leads to functional similarity

Gap penalties are a means of matching, which is locally optimal

Scoring matrix, substitution matrix
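How a gap penalty (the cost of leaving a position unmatched) and a scoring scheme work together can be sketched with a small global alignment score in the style of Needleman-Wunsch. The match, mismatch, and gap values are arbitrary assumptions. A real protein alignment would take its scores from a substitution matrix and not from a simple match/mismatch rule.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of strings a and b with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair_score = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score,  # align the two characters
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )
    return score[-1][-1]


print(global_alignment_score("GATTACA", "GATTACA"))
print(global_alignment_score("ACGT", "AGT"))
```

Making the gap penalty harsher pushes the alignment toward mismatches and away from gaps, so the choice of penalty shapes which alignment counts as best.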

Seeds: the intrinsic core of the pattern
