Chapter 94 Preparation of Probability Knowledge for Network Theory
1. Looking ahead, we hope to combine related branches of mathematics such as linear algebra and calculus to better understand random processes and pattern recognition. The modeling we do requires a solid grasp of these ideas in order to design good algorithms, from dynamic programming to the BLAST algorithm for sequence matching, which is our goal: from formulas and theorems to concrete applications. Measure theory, the axiomatic foundation of probability theory, is a basic premise and shared understanding; after all, mathematics must be rigorous. Probability is a measure defined on sets of events, just as weights and measures must be defined before anything can be described with them. With the help of probability theory, we can grasp the whole through a small part of the information; the motion of the whole is very complex, and it is difficult and unnecessary to capture all of it, so we fix on a few quantities and let the part stand for the whole. This is somewhat like the Fourier transform in signals and systems, which converts complex time-domain behavior into a simple and direct frequency-domain description. It is a change of perspective: we cannot track every specific change, which would require far too much computation, but we can understand the underlying behavior by grasping a few variables at a higher level. Statistics and probability are two sides of the same subject.
The nature of probability is explained in two ways: as frequency (the relative frequency derived from past statistics) and as probability proper (a measure of how likely a future occurrence is). If we set aside the difference in time, the two can be regarded as equivalent. Depending on how events combine, their probabilities of occurrence eventually form a probability distribution. By grasping this information we grasp the relative proportions of different events, which provides good guidance for our choices. After all, new information always reduces uncertainty and makes our decisions statistically a little better than random ones, and these small advantages add up to today's complex, highly ordered world.
Set theory is essentially a description (an abstract formulation) of various relationships, while probability theory quantifies these relations. When we build a system model, so many variables can affect the system's operation that it is impossible to grasp them all. Nevertheless, through theoretical derivation and accumulated experience, we can still make near-optimal choices.
Using the concept of functions, a probability function is essentially a mapping from an event to a probability value. The independent variable is a specific event (essentially a set, so set-theoretic properties must be considered: the full set, the empty set, complements, intersections), and the dependent variable is the probability of the event occurring (a value in the range 0 to 1). De Morgan's law is a good example: the complement of the union of A and B equals the intersection of the complement of A and the complement of B, i.e., (A ∪ B)^c = A^c ∩ B^c, which is easily seen to hold on a Venn diagram. Establishing such equivalences is common in mathematical logic, typically by constructing a truth table; a standard method of proof is proof by contradiction.
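De Morgan's law can be checked directly with set operations. A minimal sketch, where the universe U and the sets A and B are arbitrary illustrative choices:

```python
# Verify De Morgan's law (A ∪ B)^c = A^c ∩ B^c on a small universe.
U = set(range(10))
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}

lhs = U - (A | B)        # complement of the union
rhs = (U - A) & (U - B)  # intersection of the complements
assert lhs == rhs        # De Morgan's first law holds

# The dual law: (A ∩ B)^c = A^c ∪ B^c
assert U - (A & B) == (U - A) | (U - B)
```

Exhaustively checking one finite example is not a proof, but it mirrors the truth-table method mentioned above: every membership pattern of an element with respect to A and B occurs in this universe.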
The sample space is the set of all possible outcomes of a probabilistic experiment (it can be thought of as a set of sets). Any specific outcome is fixed and exclusive and hard to predict, but we can grasp the relative proportions at the statistical level.
The system is uncertain, so we need to grasp the quantities that are deterministic; probability is such a refinement of the whole into a family of set measures. Building a probabilistic model therefore helps us understand complex systems, since the proportion in which random events occur, i.e., their probability, is known.
2. The axiom system of probability: (1) the probability of any event is greater than or equal to zero; (2) the probability of the entire sample space is one; (3) the probability of the union of mutually exclusive events equals the sum of their individual probabilities.
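The three axioms can be checked mechanically for any finite discrete distribution. A minimal sketch using a fair die as an illustrative example:

```python
# Check the three probability axioms for a finite discrete distribution.
# The fair-die probabilities below are an illustrative choice.
p = {1: 1/6, 2: 1/6, 3: 1/6, 4: 1/6, 5: 1/6, 6: 1/6}

# Axiom 1: every probability is >= 0
assert all(v >= 0 for v in p.values())
# Axiom 2: the whole sample space has probability 1
assert abs(sum(p.values()) - 1.0) < 1e-12

def prob(event):
    """P(event) for an event given as a set of outcomes."""
    return sum(p[o] for o in event)

# Axiom 3: additivity for mutually exclusive events
A, B = {1, 2}, {5, 6}  # disjoint events
assert abs(prob(A | B) - (prob(A) + prob(B))) < 1e-12
```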
Systems built from underlying assumptions work like the axiom system of Euclidean geometry, which derives all manner of complex geometric relations from just five axioms; changing the fifth axiom yields Riemannian geometry and Lobachevsky geometry. Different hypotheses lead to different axiom systems, which in turn lead to different theorems and properties. The axioms are the foundation of the edifice: as long as they hold in a particular domain, the theorems and properties that follow are guaranteed and can be used directly.
In the same way, a sufficiently complex event can be decomposed into basic events that satisfy the axioms. This reductionist approach played a huge role in calculus, and likewise programming languages need only sequential, branching, and looping statements to represent all kinds of complex logic (C. Böhm & G. Jacopini, "Flow Diagrams, Turing Machines and Languages with Only Two Formation Rules," Communications of the ACM, vol. 9(5), May 1966, pp. 366–371). In essence, then, the axiomatic system is a tautology: the various theorems are meaningful combinations of the axioms.
Of course, the number of such combinations grows exponentially, and extracting meaningful theorems from them is like finding a needle in a haystack; yet it is feasible, and can even be seen as a kind of emergence. Consider the properties derived this way: the probability of the empty set is 0, and the probability of any event A equals 1 minus the probability of its complement, P(A) = 1 − P(A^c).
Probability is a measure of how much information we have mastered. Conditional probability, P(X|Y) = P(XY)/P(Y), updates the probability of a specific event; it is essentially an update of the sample space, which changes the probability of the original event. By transforming events in this way, the probability of different combinations of events can be obtained.
Property 1: the conditional probability P(X|Y) is greater than or equal to 0. Property 2: P(Y|Y) equals 1. Property 3: if A and B are mutually exclusive, then the conditional probability of their union given Y equals the sum of their individual conditional probabilities given Y. These mirror the axiom system of probability: the axioms extend naturally to conditional probability.
The total probability theorem: for any event A and a partition C1, ..., CN of the sample space, P(A) = ∑P(A|Ci)P(Ci) = P(A|C1)P(C1) + P(A|C2)P(C2) + ... + P(A|CN)P(CN). It is a decomposition. Its inverse operation is Bayes' theorem (Bayes' rule), which considers the probability of one of the Cj given that a particular event A has occurred: P(Cj|A) = P(CjA)/P(A) = P(A|Cj)P(Cj) / ∑P(A|Ci)P(Ci). Transformations of events like these can form complex relationships that correspond to specific events occurring in reality.
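The two formulas above can be sketched numerically for a two-hypothesis partition; the prior and likelihood values below are made up for illustration:

```python
# Total probability and Bayes' rule for a partition {C1, C2}.
# Numbers are illustrative: priors P(Ci) and likelihoods P(A|Ci).
P_C = [0.3, 0.7]          # P(C1), P(C2): a partition, so they sum to 1
P_A_given_C = [0.9, 0.2]  # P(A|C1), P(A|C2)

# Total probability: P(A) = sum_i P(A|Ci) P(Ci)
P_A = sum(l * c for l, c in zip(P_A_given_C, P_C))  # 0.9*0.3 + 0.2*0.7 = 0.41

# Bayes' rule: P(Cj|A) = P(A|Cj) P(Cj) / P(A)
posterior = [l * c / P_A for l, c in zip(P_A_given_C, P_C)]

print(P_A)        # 0.41
print(posterior)  # posteriors over the partition sum to 1
```

Note how observing A shifts weight toward C1, whose likelihood P(A|C1) is higher, even though C1's prior was the smaller of the two.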
3. Independence of events: if the probability that two events A and B occur simultaneously equals the product of their individual probabilities, P(AB) = P(A)P(B), then A and B are independent events, i.e., they do not affect each other. Equivalently, given that one event has occurred, the probability that the other occurs is unchanged: P(A|B) = P(A).
The Hardy–Weinberg genetic equilibrium relates the frequencies of alleles A and a to the frequencies of the genotypes they form (AA, Aa, aa): P(AA) : P(Aa) : P(aa) = P(A)^2 : 2P(A)P(a) : P(a)^2.
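This is an application of independence: under random mating, the two alleles an offspring receives are drawn independently. A minimal sketch with illustrative allele frequencies:

```python
# Hardy–Weinberg genotype frequencies from allele frequencies.
# p = P(A), q = P(a); the values here are illustrative.
p, q = 0.6, 0.4
assert abs(p + q - 1.0) < 1e-12

AA = p * p      # both alleles are A (independent draws)
Aa = 2 * p * q  # heterozygote arises two ways: Aa or aA
aa = q * q      # both alleles are a

# The genotype frequencies form a valid distribution: (p + q)^2 = 1
assert abs(AA + Aa + aa - 1.0) < 1e-12
print(AA, Aa, aa)  # 0.36 0.48 0.16
```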
Computing probabilities: (1) decompose into simple basic events, multiplying and then adding their probabilities; (2) use permutations and combinations; (3) use the experimental method — the Monte Carlo approach is based on this principle, reducing complex probability calculations to large-scale experimental statistics, where the observed frequency approximates the probability.
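The experimental method can be sketched with a simple Monte Carlo estimate; the target event (two fair dice summing to 7, exact probability 1/6), the seed, and the trial count are illustrative choices:

```python
import random

# Monte Carlo estimate of a probability by large-scale simulation:
# the probability that two fair dice sum to 7 (exact value 1/6).
random.seed(42)  # fixed seed for reproducibility
trials = 100_000
hits = sum(1 for _ in range(trials)
           if random.randint(1, 6) + random.randint(1, 6) == 7)

estimate = hits / trials  # relative frequency approximates probability
print(estimate)           # close to 1/6 ≈ 0.1667
```

The standard error of such an estimate shrinks like 1/sqrt(trials), which is why the method needs large-scale repetition.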
The binomial theorem: its coefficients correspond to the number of ways each outcome can occur.
4. Random variables are essentially functions. There are discrete and continuous random variables: the former take a finite or countably infinite number of values, the latter an uncountably infinite number. The uncountability of the continuum is established by Cantor's diagonal construction: for any proposed enumeration, one can always exhibit a number it misses.
Comparing the sizes of sets is done through correspondence: for example, the sets of odd numbers and even numbers are equipotent with the set of integers, because each element of one set can be matched with an element of the other, i.e., a one-to-one mapping; likewise a line segment has as many points as a plane.
The cumulative distribution function (CDF) is used to compute the probability that the value of a random variable falls within a certain range: the CDF of a discrete random variable is a sum of point masses, and the CDF of a continuous random variable is the area under the density function over the interval.
The probability mass function (PMF) applies only to discrete random variables.
Probability distributions: the Bernoulli distribution is the basic example.
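The PMF/CDF relationship for a discrete variable can be sketched with a Bernoulli example; the parameter p below is an illustrative choice (0.25 is used so the arithmetic is exact in floating point):

```python
# PMF and CDF of a Bernoulli(p) random variable — a minimal sketch.
p = 0.25  # illustrative success probability

def pmf(x):
    """P(X = x) for X ~ Bernoulli(p)."""
    return {0: 1 - p, 1: p}.get(x, 0.0)

def cdf(x):
    """P(X <= x): sum the point masses up to x."""
    return sum(pmf(k) for k in (0, 1) if k <= x)

assert pmf(0) + pmf(1) == 1.0            # masses sum to 1
assert cdf(-1) == 0.0 and cdf(1) == 1.0  # CDF runs from 0 to 1
print(cdf(0))                            # 0.75, the mass at X = 0
```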
5. Families of distributions: discrete probability distributions, continuous probability distributions, and the geometric distribution.
The Poisson distribution, P(X = x) = e^(−λT)(λT)^x / x!, is an approximation of the binomial distribution, P(X = x) = C(n, x) p^x (1 − p)^(n−x), when n is large and p is small with λT = np.
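The approximation can be checked numerically; the values of n, p, and the range of x below are illustrative:

```python
from math import comb, exp, factorial

# Poisson approximation of the binomial: with n large, p small, and
# lam = n*p fixed, C(n, x) p^x (1-p)^(n-x) ≈ e^(-lam) lam^x / x!.
n, p = 1000, 0.003
lam = n * p  # 3.0

def binom_pmf(x):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x):
    return exp(-lam) * lam**x / factorial(x)

for x in range(6):
    print(x, binom_pmf(x), poisson_pmf(x))  # the two columns agree closely
```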
The normal distribution is a further approximation.
The probability density function (PDF): going from PDF to CDF is integration, and going from CDF to PDF is differentiation — similar to extracting a lower-dimensional summary from a higher-dimensional quantity.