Data Types for Data Science

  • Structured versus unstructured data (sometimes called organized vs unorganized)
  • Quantitative and qualitative data
  • The four levels of data

Structured versus unstructured data

  • Structured (organized) data: This is data that can be thought of as observations and characteristics. It is usually organized as a table (rows and columns).
  • Unstructured (unorganized) data: This data exists as a free entity and does not follow any standard organizational hierarchy.
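
To make the contrast concrete, here is a minimal Python sketch. The names, ages, and cities are made up for illustration, and the regular expression is only a crude stand-in for the parsing work that unstructured data usually needs before it can be analyzed.

```python
import re

# Structured: each dict is an observation (row), each key is a characteristic (column).
structured = [
    {"name": "Ana", "age": 31, "city": "Lisbon"},
    {"name": "Budi", "age": 27, "city": "Jakarta"},
]

# Unstructured: free text with no fixed rows or columns.
unstructured = "Ana, who is 31, wrote from Lisbon. Budi (27) lives in Jakarta."

# With structured data, a column can be read directly.
ages = [row["age"] for row in structured]

# With unstructured data, the structure has to be extracted first.
# Grabbing every run of digits is a rough assumption that happens to work here.
ages_from_text = [int(number) for number in re.findall(r"\d+", unstructured)]

print(ages)
print(ages_from_text)
```

Both lists end up holding the same ages, but only the structured version gets there without guessing how the text is laid out.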

Quantitative versus qualitative data

  • Quantitative data: This data can be described using numbers, and basic mathematical procedures, including addition, are possible on the set.
  • Qualitative data: This data cannot be described using numbers and basic mathematics. It is generally described using natural categories and language.
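
A short Python sketch of the difference, using invented values for a hypothetical set of shops: arithmetic such as an average makes sense for the quantitative column, while the qualitative column can only be counted by category.

```python
from collections import Counter
from statistics import mean

# Hypothetical observations, made up for illustration.
revenue = [1200.0, 950.5, 1100.0]                # quantitative: numbers we can add
country = ["Indonesia", "Brazil", "Indonesia"]   # qualitative: categories

# Addition (and therefore an average) is meaningful for quantitative data.
average_revenue = mean(revenue)

# For qualitative data we can count how often each category appears,
# but "Indonesia + Brazil" has no numeric meaning.
country_counts = Counter(country)

print(average_revenue)
print(country_counts)
```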

The four levels of data

  • The nominal level
  • The ordinal level
  • The interval level
  • The ratio level

The nominal level

The first level of data, the nominal level (which also sounds like the word name), consists of data that is described purely by name or category. Basic examples include gender, nationality, species, or the yeast strain in a beer. These values are not described by numbers and are therefore qualitative. The following are some examples:

  • A type of animal is on the nominal level of data. We may also say that if you are a chimpanzee, then you belong to the mammalian class as well.
  • A part of speech is also considered on the nominal level of data. The word she is a pronoun, and it is also a noun.
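
As a rough illustration in Python (the list of animals is invented), the operations that do make sense on nominal data are checking whether two observations share a category and counting how often each category occurs:

```python
from collections import Counter

# Hypothetical nominal observations: each value is just a name.
animals = ["chimpanzee", "dog", "chimpanzee", "cat"]

# Equality is meaningful: two observations either share a category or they don't.
same_category = animals[0] == animals[2]

# Counting is meaningful too, which gives us the most frequent category.
counts = Counter(animals)
most_common_animal, frequency = counts.most_common(1)[0]

print(same_category)
print(most_common_animal, frequency)
```

Sorting these names alphabetically is possible in code, but the resulting order says nothing about the animals themselves.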

The ordinal level

The nominal level did not give us much flexibility in terms of mathematical operations because of one seemingly unimportant fact: we could not order the observations in any natural way. Data at the ordinal level provides us with a rank order, or the means to place one observation before another. However, it does not provide us with relative differences between observations. While we may order the observations from first to last, we cannot add or subtract them to get any real meaning.
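
A minimal sketch of ordinal data in Python, assuming a made-up satisfaction survey with three labels. The integer ranks exist only to express the order; the gaps between them carry no meaning, so the sketch sorts and picks a middle response but does not add or subtract.

```python
from statistics import median_low

# Hypothetical ordered labels, from lowest to highest.
levels = ["low", "medium", "high"]
rank = {label: position for position, label in enumerate(levels)}

# Hypothetical survey responses.
responses = ["high", "low", "medium", "high", "medium"]

# Ordering is meaningful at the ordinal level.
ordered = sorted(responses, key=lambda label: rank[label])

# So is the middle observation; median_low always returns an actual data point.
middle = levels[median_low([rank[label] for label in responses])]

print(ordered)
print(middle)
```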

The interval level

Now we are getting somewhere interesting. At the interval level, we begin to look at data that can be expressed through very quantifiable means, and where much more complicated mathematical formulas are allowed. The basic difference between the ordinal level and the interval level is, well, just that: difference. Data at the interval level allows meaningful subtraction between data points.
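
Temperature in degrees Celsius is a common example of interval data. The readings below are invented; the sketch only shows that subtracting two data points, and averaging them, gives numbers we can interpret.

```python
from statistics import mean

# Hypothetical daily temperatures in degrees Celsius.
temps_c = [18.0, 21.5, 25.0]

# Subtraction is meaningful: the last day was this many degrees warmer than the first.
warming = temps_c[-1] - temps_c[0]

# Because differences are meaningful, so is the arithmetic mean.
average = mean(temps_c)

print(warming)
print(average)
```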

The ratio level

Finally, we will take a look at the ratio level. After moving through three levels with differing sets of allowed mathematical operations, the ratio level proves to be the strongest of the four. Not only can we define order and difference, but the ratio level also allows us to multiply and divide. This might not seem like much to make a fuss over, but it changes almost everything about the way we view data at this level.
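
One way to see why division matters is to compare a quantity with a natural zero, such as weight, against Celsius temperature, whose zero is arbitrary. The values and the helper celsius_to_kelvin below are illustrative assumptions, not part of any library.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to kelvin (hypothetical helper for this sketch)."""
    return celsius + 273.15

# Weight has a natural zero, so a ratio survives a change of unit.
ratio_in_kg = 100.0 / 50.0
ratio_in_grams = 100_000.0 / 50_000.0

# Celsius has an arbitrary zero, so "20 degrees is twice 10 degrees" is not a real claim:
# the same two temperatures give a very different ratio on the Kelvin scale.
celsius_ratio = 20.0 / 10.0
kelvin_ratio = celsius_to_kelvin(20.0) / celsius_to_kelvin(10.0)

print(ratio_in_kg, ratio_in_grams)
print(celsius_ratio, round(kelvin_ratio, 3))
```

The weight ratio is the same in kilograms and grams, which is what makes statements like "twice as heavy" valid at the ratio level.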

Some Parting Words

Please don’t feel overwhelmed; we have only just started with the basics. There is a lot to learn, but after revisiting this chapter and picking up new concepts, you will start to notice these data types in your daily routine. And that’s a big leap toward becoming an amazing data scientist.

Desi Ratna Ningsih

Data Science Enthusiast, Remote Worker, Course Trainer, Archery Coach, Psychology and Philosophy Student