Friday, May 24, 2024

Probability and Statistical Operation Using Python

 STATISTICS AND PROBABILITY 

STATISTICS:
Statistics, in general, is the process of gathering information, tabulating it, and interpreting it numerically. It is the branch of applied mathematics that deals with collecting, analysing, interpreting, and presenting data. With statistics, we can see how data can be used to tackle complicated problems.

This tutorial explains the underlying theory and shows how to use Python to solve statistical problems. To begin with, let us go over a few ideas that will be helpful in this post.

Descriptive statistics, in general, describe data using representative summaries such as tables, charts, and spreadsheets. Presenting the data this way brings out important information that may help identify potential trends. Univariate analysis is the process of describing and summarising a single variable. Bivariate analysis is the process of describing the statistical relationship between two variables. Multivariate analysis is the process of describing the statistical relationship between several variables.
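
To make the difference between univariate and bivariate analysis concrete, here is a minimal sketch using only the Python standard library. The two lists are made-up illustrative values, and pearson_correlation is a helper name chosen for this example; the Pearson correlation coefficient is just one common way to describe a relationship between two variables.

```python
import math

# Made-up illustrative data: one value per student.
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [52, 55, 61, 70, 72]


def pearson_correlation(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (proportional to the variances).
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Univariate analysis: summarise a single variable on its own.
score_summary = {
    "min": min(exam_scores),
    "max": max(exam_scores),
    "mean": sum(exam_scores) / len(exam_scores),
}

# Bivariate analysis: describe how two variables move together.
r = pearson_correlation(hours_studied, exam_scores)

print(score_summary)
print(round(r, 3))
```

A value of r close to 1 means the two variables tend to rise together, a value close to -1 means one falls as the other rises, and a value near 0 means there is little linear relationship.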

There are two types of descriptive statistics:

  • Measure of central tendency
  • Measure of variability
Measure of Central Tendency:
  • Mean: It is the sum of observations divided by the total number of observations. It is also called the average.
  • Median: It is the middle value of the sorted data set. It splits the data into two halves. The data is sorted first; if the number of elements is odd, the centre element is the median, and if it is even, the median is the average of the two central elements.
  • Mode: It is the value that has the highest frequency in the given data set. The data set may have no mode if the frequency of all data points is the same. 
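
The three measures above can be sketched with Python's built-in statistics module. The numbers in the lists are arbitrary values picked for illustration, not data from this post.

```python
import statistics

# Arbitrary illustrative data with an odd number of elements.
data = [4, 8, 6, 5, 3, 8, 9]

mean_value = statistics.mean(data)      # sum of observations / count
median_value = statistics.median(data)  # middle value after sorting
mode_value = statistics.mode(data)      # value with the highest frequency

# With an even number of elements, the median is the
# average of the two central values (here 4 and 5).
even_median = statistics.median([3, 4, 5, 6])

# When every value occurs equally often there is no single mode;
# multimode() then returns all of the values.
all_modes = statistics.multimode([1, 2, 3])

print(mean_value, median_value, mode_value)
print(even_median, all_modes)
```

Note that statistics.median() sorts the data internally, so the input list does not need to be sorted beforehand.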
Measure of Variability:
  • Range: The difference between the largest and smallest data point in our data set is known as the range. The bigger the range, the wider the spread of the data, and vice versa. Because it depends only on the two extreme values, the range is sensitive to outliers.
  • Variance: It is defined as the average squared deviation from the mean. It is calculated by finding the difference between every data point and the mean, squaring each difference, adding them all up, and then dividing by the number of data points in the data set. This is the population variance; the sample variance divides by one less than the number of data points.
  • Standard deviation: It is defined as the square root of the variance. It is calculated by finding the mean, subtracting the mean from each data point, and squaring the result, then adding all the squared values, dividing by the number of terms, and taking the square root.
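
As a rough sketch, the measures of variability can be computed step by step and then checked against the statistics module. The data values are arbitrary and chosen so that the results come out as round numbers.

```python
import math
import statistics

# Arbitrary illustrative data.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Variance by hand: average of the squared deviations from the mean.
mean_value = sum(data) / len(data)
squared_deviations = [(x - mean_value) ** 2 for x in data]
variance_by_hand = sum(squared_deviations) / len(data)

# Standard deviation: square root of the variance.
std_by_hand = math.sqrt(variance_by_hand)

# The same population measures from the standard library.
population_variance = statistics.pvariance(data)
population_std = statistics.pstdev(data)

# Sample variance divides by (n - 1) instead of n.
sample_variance = statistics.variance(data)

print(data_range, variance_by_hand, std_by_hand)
print(population_variance, population_std, sample_variance)
```

The population functions (pvariance, pstdev) match the divide-by-count definition given above, while statistics.variance() and statistics.stdev() give the sample versions, which are slightly larger for the same data.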
PROBABILITY DISTRIBUTION:
A probability distribution describes how likely each possible outcome or value is for a given data set or random variable. Probability distributions occur in a variety of forms and sizes, each with its own set of characteristics such as mean, median, mode, skewness, standard deviation, kurtosis, etc. Probability distributions are of various types; this article looks at a few of them.
Three common types of probability distribution are:
  • Normal: The normal distribution is a symmetric probability distribution centred on the mean, indicating that data around the mean occur more frequently than data far from it. The normal distribution is also called the Gaussian distribution. The normal distribution curve resembles a bell curve.
  • Binomial: The binomial distribution describes the number of successes in a fixed number of independent trials where each trial has only two outcomes, for example an experiment that is a success or a failure, or a question whose answer is “yes” or “no”. np.random.binomial() is used to generate binomial data; n refers to the number of trials and p refers to the probability of success in each trial.
  • Poisson: A Poisson distribution is a kind of probability distribution used in statistics to describe how many times an event is expected to happen over a certain amount of time. It is also called a count distribution. np.random.poisson() is used to create data for the Poisson distribution.
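
Here is a minimal sketch of drawing samples from these three distributions with NumPy, assuming NumPy is installed. The parameter values (mean 0 and standard deviation 1, 10 trials with probability 0.5, an average rate of 3 events) and the seed are arbitrary choices for illustration.

```python
import numpy as np

# Fix the seed so the random draws are reproducible.
np.random.seed(0)

sample_size = 10_000

# Normal: loc is the mean, scale is the standard deviation.
normal_samples = np.random.normal(loc=0.0, scale=1.0, size=sample_size)

# Binomial: n is the number of trials, p is the probability
# of success in each trial. Each sample is a success count.
binomial_samples = np.random.binomial(n=10, p=0.5, size=sample_size)

# Poisson: lam is the expected number of events per interval.
poisson_samples = np.random.poisson(lam=3.0, size=sample_size)

# The sample means should land close to the theoretical means:
# 0 for the normal, n * p = 5 for the binomial, lam = 3 for the Poisson.
print(normal_samples.mean())
print(binomial_samples.mean())
print(poisson_samples.mean())
```

Plotting a histogram of each array (for example with matplotlib) is a quick way to see the bell shape of the normal samples and the discrete counts of the other two.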
As I said previously, I will take readers through a hands-on journey. I am providing a link to a freely accessible folder where I have posted various documents in which I have implemented the models at the simplest level, so any beginner can easily understand them.
Those models are implemented in Jupyter Notebook, an interactive environment for writing and running Python projects.


Kindly refer to the link provided below:


With the end of this blog, we have covered all the basics of Machine Learning. I hope I have delivered enough knowledge of the basics of ML, and that my readers are happy and satisfied.


THANK YOU ALL FOR SUPPORTING ME. I HOPE FOR THE SAME GOOD RESPONSE FROM YOU GUYS!

