Basics of Medical statistics

Statistics
Statistics is defined as a process by which numerical data are transformed into a usable form for scientific interpretation.

This process can be carried out in two ways:
 * 1) Descriptive statistics: manipulating data to summarize the findings.
 * 2) Inferential statistics: developing general conclusions from the data.

Medical Statistics
Medical statistics is the application of mathematical statistics in the medical field. It is needed for the following reasons: it is a basic requirement for medical research, it keeps medical knowledge up to date, and it supports data management and treatment.

Data
The first step in the process of statistics is to identify what types of data you are dealing with, in order to get the most useful information from a mass of figures.

There are two main types of data. Both can be further divided into two subgroups:
 * 1) Qualitative Data: nominal and ordinal.
 * 2) Quantitative Data: discrete and continuous.

Qualitative Data
Qualitative data describe the quality of something, for example its value, appearance or taste. They cannot be measured directly because what they represent can only be approximated; instead, the observations fall into categories.

There are two sub-types of Qualitative Data:
 * 1) Nominal: Data are classified using numbers that are arbitrarily assigned to particular categories. For example, giving nationalities particular numbers, such as 1 for Iceland, 2 for England and 3 for Wales, would produce nominal data. Dichotomous (or binary) data are a special type of nominal data where there are only two categories. Examples include coding male subjects as 1 and female subjects as 2, or dividing subjects into groups according to the presence or absence of a particular disease.
 * 2) Ordinal: In contrast with nominal data, ordinal data deal with categories that can be organized in some logical sequence known as “rank order”. The categories themselves can be either numeric (for example, the Glasgow coma score) or non-numeric.
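
The difference between nominal and ordinal coding can be illustrated with a minimal Python sketch. The nationality codes follow the example above; the pain scale `PAIN_ORDER` and the helper functions are hypothetical and serve only as an illustration.

```python
# Nominal data: the numeric codes are arbitrary labels
# (codes taken from the nationality example above).
NATIONALITY_CODES = {"Iceland": 1, "England": 2, "Wales": 3}

# Ordinal data: the categories have a meaningful rank order.
# This pain scale is a hypothetical illustration.
PAIN_ORDER = ["none", "mild", "moderate", "severe"]


def rank(category):
    """Return the position of an ordinal category in its rank order."""
    return PAIN_ORDER.index(category)


def is_worse(a, b):
    """Ordinal comparison is meaningful: is pain level a worse than b?"""
    return rank(a) > rank(b)


# Relabelling nominal codes does not change the data, so comparing
# or averaging the codes themselves carries no information.
relabelled = {"Iceland": 3, "England": 1, "Wales": 2}
same_categories = set(NATIONALITY_CODES) == set(relabelled)

print(is_worse("severe", "mild"))  # True
print(same_categories)             # True
```

The nominal codes can be shuffled without losing anything, whereas shuffling `PAIN_ORDER` would destroy the information the ordinal scale carries.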

Quantitative Data
Quantitative data are information that can be measured and presented as numbers. Examples include height and weight.

There are two sub-types of Quantitative Data:
 * 1) Discrete: The data can only take certain distinct values, i.e. whole numbers. For example, the number of compliments your department receives per week and the weekly number of cardiac arrests are quantitative-discrete data.
 * 2) Continuous: When there are more than the arbitrary figure of 20 possible values, and the values are invariably associated with units of measurement, the data are considered to be quantitative-continuous. These are similar to discrete data in that the difference between consecutive numbers is equal. However, they do not need to be whole numbers; they can take any value within a particular range. Everyday examples include SaO2, heart rate, blood pressure and weight.

Summarizing Data
Summarizing data is important because it allows the information to be interpreted easily and quickly. It can be done graphically or in a tabular format, depending upon the type of presentation.

Tabular Summary
Tables are commonly used to summarize nominal data, but they can be applied to ordinal and quantitative data as well. The number of observations within a particular category is called the frequency. Consequently, a frequency table lists the frequencies of the different categories.
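
A frequency table of this kind can be sketched in a few lines of Python. The blood-group observations and the function name `frequency_table` are hypothetical and are used only to illustrate the idea.

```python
from collections import Counter

# Hypothetical nominal observations (blood groups of ten patients).
blood_groups = ["A", "O", "B", "O", "A", "AB", "O", "A", "O", "B"]


def frequency_table(observations):
    """Return (category, frequency, relative frequency) rows, most frequent first."""
    counts = Counter(observations)
    total = len(observations)
    return [(cat, n, n / total) for cat, n in counts.most_common()]


table = frequency_table(blood_groups)
for category, freq, rel in table:
    # One row per category: label, count, share of all observations.
    print(f"{category:<3} {freq:>2} {rel:.0%}")
```

The relative frequency column is optional, but it makes tables from samples of different sizes easier to compare.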

Graphical Summary
There are several ways of summarizing information graphically (for example line charts, bar charts and pie charts); the choice depends upon the type of data you are dealing with.
 * Histogram: a graphical representation of the distribution of the data. The data are shown as rectangles representing different categories (intervals); the rectangles do not overlap, and each one represents the relative frequency of its category. Typically, the horizontal axis (x axis) shows the categories of data (intervals) and the vertical axis (y axis) shows the frequency. The height of each rectangle expresses the frequency or density of cases per unit of the measured variable. The same information can also be displayed in a bar chart. The graphical display enables fast, intuitive perception of the data. The bins (intervals) must be adjacent and are often (but are not required to be) of equal size. The words used to describe the patterns in a histogram are: "symmetric", "skewed left" or "skewed right", "unimodal", "bimodal" or "multimodal".
 * Frequency distribution: an organized tabulation or graphical presentation of the number of individuals in each category on the scale of measurement. It allows the researcher to view the entire data set at once. It shows whether the observations are high or low and whether they are concentrated in one place or spread out. Thus, a frequency distribution presents a picture of how the individual observations are distributed on the measurement scale.
 * Normal distribution: a bell-shaped frequency distribution curve. This curve, which is sometimes called a “Gaussian distribution”, is rightly regarded as the most important in the discipline of statistics. It has a single peak with an even distribution of values on either side, so the mean, median and mode are equal. The further a data point is from the mean, the less likely it is to occur. It is "normal" in the sense that it often provides an excellent model for the observed frequency distribution of many naturally occurring events. The vast majority of distributions in medical statistics have only one peak (that is, they are unimodal), for example weight, height and blood pressure.
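
The binning behind a histogram can be sketched as follows. This is a minimal illustration that assumes adjacent, equal-width bins; the blood pressure values and the function name `histogram` are hypothetical.

```python
def histogram(values, low, width, n_bins):
    """Count values in n_bins adjacent, equal-width bins starting at low.

    Bin i covers [low + i*width, low + (i+1)*width); the last bin also
    includes its upper edge. Values outside the range are ignored.
    """
    counts = [0] * n_bins
    high = low + width * n_bins
    for v in values:
        if v < low or v > high:
            continue
        # Clamp so that a value equal to the upper edge lands in the last bin.
        i = min(int((v - low) // width), n_bins - 1)
        counts[i] += 1
    return counts


# Hypothetical systolic blood pressures in mmHg.
pressures = [112, 118, 121, 124, 125, 127, 129, 131, 133, 136, 142, 155]
counts = histogram(pressures, low=110, width=10, n_bins=5)

# Text rendering: one row per bin, one '#' per observation.
for i, c in enumerate(counts):
    start = 110 + 10 * i
    print(f"{start}-{start + 10}: {'#' * c}")
```

With these made-up values the counts have a single peak in the 120-130 bin and a longer tail towards higher pressures, which would be described as unimodal and skewed right.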

Central tendency
A measure of central tendency is a single value that identifies a central or typical value in a set of data for a particular parameter. Central tendency can be estimated in the following ways: 1. If all the numbers in the list are the same, then this number should be used. 2. If the numbers are not the same, the average is calculated by combining the numbers from the list in a specific way and computing a single number as the average of the list.

Several types of average are used, depending on what kind of data the numbers represent: the mean, the median and the mode.
 * 1) Mean: The best estimate of the measured quantity X from n measurements. It is calculated as the sum of all measured values divided by the number of measurements. The mean cannot be used for ordinal data because it relies on the assumption that the gaps between the categories are the same.
 * 2) Median: Divides an ordered sample into two equally sized parts (each with probability 0.50). The numbers are arranged in either descending or ascending order and the middle number is taken; with an even number of values, the mean of the two middle numbers is used.
 * 3) Mode: The most frequent value in the sample. It represents the most popular option and corresponds to the highest bar in a histogram.
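
The three averages can be computed with Python's standard `statistics` module. The weekly cardiac arrest counts below are hypothetical and were chosen to include one unusually high week.

```python
import statistics

# Hypothetical weekly numbers of cardiac arrests over nine weeks.
arrests = [2, 3, 3, 4, 5, 3, 6, 2, 17]

mean = statistics.mean(arrests)      # sum of values / number of values
median = statistics.median(arrests)  # middle value of the sorted sample
mode = statistics.mode(arrests)      # most frequent value

print(mean, median, mode)  # 5 3 3
```

In this sample the single extreme week (17) pulls the mean above the median, while the median and mode are unaffected by it.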

Measures Of Variability
Measures of variability are summary measures that describe the dispersion of values within a distribution, that is, the spread of values in the data. Each allows us to summarize the spread of our data set with a single value, showing how much the observations in the data set vary. By using central tendency and variability together, we get a more accurate picture of the data set.

The three main measures of variability are the range, the interquartile range and the standard deviation.
 * 1) Range: The numerical distance between the largest value (X maximum) and the smallest value (X minimum). It tells us about the variation in scores in our data, that is, the width of our data set.
 * 2) Interquartile range (IQR): A measure of variability based on dividing a data set into quartiles. Quartiles divide a rank-ordered data set into four equal parts. The values that separate the parts are called the first, second and third quartiles, denoted by Q1, Q2 and Q3 respectively. The IQR is the difference between the third and first quartiles (Q3 - Q1).
 * 3) Standard deviation: Provides a numerically meaningful measure of spread. It can be thought of as the typical distance of each observation from the mean (formally, it is the square root of the variance). This value, when combined with other statistical methods, allows us to infer what percentage of our observations lie within a certain distance of the mean.
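
The three measures above can be computed with Python's standard `statistics` module, as in the sketch below. The heart rate values are hypothetical. Several conventions exist for computing quartiles; the `method="inclusive"` setting used here is only one of them, so other software may report slightly different quartiles. The last part assumes the data follow a normal distribution, which is an assumption of the sketch and not something the sample itself guarantees.

```python
import statistics
from statistics import NormalDist

# Hypothetical heart rates in beats per minute.
heart_rates = [62, 66, 68, 70, 72, 74, 76, 80, 84, 88]

# Range: largest value minus smallest value.
data_range = max(heart_rates) - min(heart_rates)

# quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(heart_rates, n=4, method="inclusive")
iqr = q3 - q1

# Sample standard deviation (divides by n - 1).
sd = statistics.stdev(heart_rates)


def share_within(k):
    """Share of observations within k SDs of the mean under a normal model."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)


print(f"range={data_range}, IQR={iqr}, SD={sd:.2f}")
for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The second quartile Q2 is the median, so `q2` agrees with `statistics.median(heart_rates)`.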