Examples of raw data – for analysis and research

Data goes through different forms before it gets converted into useful insights. We take a look at examples of raw data. This will help us appreciate the underpinning raw material of analytics. It will also help us in understanding different types of data and how they have to be worked upon. This is the first step in using data analysis for decision-making.

The raw data 

Raw data is a collection of facts, figures, and other information that is available for analysis. This data is called raw because it has not been worked upon. Overall, there are three stages of data in analytics: raw data, processed data, and analyzed data.


Data collection 

The first step in data analytics is the collection of data. The data that we collect is typically raw data, and it can come from different sources. For example, we can collect data from primary sources like interviews or surveys. We can also use secondary sources such as government statistics and other publicly available data.

In certain cases, we may be interested in analyzing data that already sits in a database. In such cases, we collect the data by running SQL queries. The rows returned by those queries are also an example of raw data.
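As a minimal sketch of collecting raw data with a SQL query, the snippet below uses Python's built-in `sqlite3` module and an in-memory database. The table name `sales`, its column names, and the helper `collect_raw_rows` are assumptions made for illustration; the three rows are borrowed from the sales example later in this article.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions for illustration.
SETUP_SQL = """
CREATE TABLE sales (
    sku        INTEGER PRIMARY KEY,
    item_name  TEXT,
    brand      TEXT,
    price      REAL,
    units_sold INTEGER
)
"""

# Rows taken from the sales example in this article.
ROWS = [
    (300, "Writex 100", "Benolys", 2.50, 100),
    (301, "Writex 200", "Benolys", 3.50, 80),
    (302, "Phantom ball", "Benolys", 3.00, 35),
]


def collect_raw_rows(conn):
    """Pull every row as stored, with no filtering or aggregation."""
    cursor = conn.execute(
        "SELECT sku, item_name, brand, price, units_sold FROM sales ORDER BY sku"
    )
    return cursor.fetchall()


# An in-memory database stands in for a real one.
conn = sqlite3.connect(":memory:")
conn.execute(SETUP_SQL)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", ROWS)

raw_rows = collect_raw_rows(conn)
for row in raw_rows:
    print(row)
```

The query deliberately does no filtering or aggregation, so what comes back is the raw data exactly as it was recorded.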

Why is raw data important? 

Raw data is important because it is the first step in the journey of data from a potentially useful form to a useful one. A lot of attention has to be given to planning before the data is collected. Once the data has been collected, we are limited by its quality. If the data collection is not done properly, then we are stuck with less-than-optimal raw data, which could lead to less-than-optimal results at the end of the analysis process.

Secondly, raw data is also important because it can offer some insights on its own. There are times when analysts can find patterns in the raw data itself. One way to find patterns in raw data is to run SQL queries that return a summary of the database. More commonly, however, raw data is examined directly in Excel. Sometimes we are able to make sense of the data simply by looking at the tables and their values.
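A summary query of this kind could look like the sketch below. It assumes a hypothetical `sales` table (the table and column names are not from any real system) loaded with the three rows from the sales example in this article, and it uses Python's built-in `sqlite3` module to run a `GROUP BY` aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table; the names are assumptions for illustration.
conn.execute(
    "CREATE TABLE sales ("
    "sku INTEGER, item_name TEXT, brand TEXT, price REAL, units_sold INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        (300, "Writex 100", "Benolys", 2.50, 100),
        (301, "Writex 200", "Benolys", 3.50, 80),
        (302, "Phantom ball", "Benolys", 3.00, 35),
    ],
)

# One summary row per brand: item count, total units, and price range.
summary = conn.execute(
    """
    SELECT brand,
           COUNT(*)        AS items,
           SUM(units_sold) AS total_units,
           MIN(price)      AS lowest_price,
           MAX(price)      AS highest_price
    FROM sales
    GROUP BY brand
    ORDER BY brand
    """
).fetchall()

for brand, items, total_units, lowest_price, highest_price in summary:
    print(brand, items, total_units, lowest_price, highest_price)
```

With only one brand in the sample, the query returns a single summary row. On a larger table the same query would return one row per brand.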

Thirdly, raw data is also important because it sets the tone for the later stages of analysis. Care should be taken at this stage to represent and record the data effectively. Correct data entry helps the process later on by reducing the possibility of errors. Some of the errors that show up in data visualization actually originate at the raw data management stage.

What are the types of raw data? 

There are many different types of raw data. However, for our purposes, we shall define raw data in terms of the data collection procedure. Broadly, we can say that there are two types: qualitative data and quantitative data. 

| Qualitative Data | Quantitative Data |
| --- | --- |
| Interview recordings | Tables extracted from databases |
| Transcripts | Survey responses captured via scales |
| Videos | Employee performance data |

Examples of Raw Data 

Survey of weights of students in a classroom 

In this example, we look at the distribution of the weights of students in a classroom. The individual weights are the raw data. This data can be arranged or summarized to produce a meaningful output. For example, we can average all the weights to find the average weight of a student in the classroom. We can also find the maximum and minimum weight of a student from this table.

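| Student number | Weight |
| --- | --- |
| 1 | 61 |
| 2 | 79 |
| 3 | 90 |
| 4 | 53 |
| 5 | 79 |
| 6 | 102 |
| 7 | 45 |
| 8 | 67 |
| 9 | 105 |
| 10 | 73 |
| 11 | 72 |
| 12 | 85 |
| 13 | 81 |
| 14 | 77 |
| 15 | 98 |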
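The average, maximum, and minimum described above can be computed in a few lines. The sketch below uses the fifteen weights from the table and Python's standard `statistics` module. The variable names are illustrative, and students are assumed to be numbered 1 to 15 in table order.

```python
from statistics import mean

# Weights from the table, in student order (student 1 first).
weights = [61, 79, 90, 53, 79, 102, 45, 67, 105, 73, 72, 85, 81, 77, 98]

average_weight = mean(weights)
max_weight = max(weights)
min_weight = min(weights)

# Student numbers start at 1, so add 1 to the list index.
heaviest_student = weights.index(max_weight) + 1
lightest_student = weights.index(min_weight) + 1

print(f"Average weight: {average_weight:.1f}")                     # 77.8
print(f"Maximum weight: {max_weight} (student {heaviest_student})")  # 105 (student 9)
print(f"Minimum weight: {min_weight} (student {lightest_student})")  # 45 (student 7)
```

The raw list itself says little at a glance. The three summary numbers are the first "processed" view of it.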

 

An example of Sales Data

When we collect data from a small sample based upon a small number of variables, that data is typically small data. Although there has been a lot of emphasis on big data in recent years, even small data can be quite useful for analysis. This type of data is much faster to collect and much easier to process. A smaller data set can even be processed manually by the analyst. The table below is one of the simplest examples of raw data.

| SKU | Item name | Brand | Price | Sales |
| --- | --- | --- | --- | --- |
| 300 | Writex 100 | Benolys | $2.50 | 100 |
| 301 | Writex 200 | Benolys | $3.50 | 80 |
| 302 | Phantom ball | Benolys | $3.00 | 35 |
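To illustrate how even a small raw table has to be worked upon before it yields a result, the sketch below parses the three rows as text and computes two totals. It assumes that the "Sales" column holds the number of units sold, which the table does not state. The helper `parse_row` and the field names are hypothetical.

```python
from decimal import Decimal

# The rows exactly as they appear in the table, kept as raw text.
raw_rows = [
    ("300", "Writex 100", "Benolys", "$2.50", "100"),
    ("301", "Writex 200", "Benolys", "$3.50", "80"),
    ("302", "Phantom ball", "Benolys", "$3.00", "35"),
]


def parse_row(row):
    """Convert one raw text row into typed fields."""
    sku, item_name, brand, price_text, sales_text = row
    return {
        "sku": int(sku),
        "item_name": item_name,
        "brand": brand,
        # Decimal avoids floating-point rounding surprises with money.
        "price": Decimal(price_text.lstrip("$")),
        # Assumption: "Sales" means units sold.
        "units_sold": int(sales_text),
    }


records = [parse_row(row) for row in raw_rows]

total_units = sum(r["units_sold"] for r in records)
total_revenue = sum(r["price"] * r["units_sold"] for r in records)

print(f"Total units sold: {total_units}")    # 215
print(f"Total revenue: ${total_revenue}")    # $635.00
```

If "Sales" instead records revenue or number of orders, the revenue line would need to change accordingly.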

Raw audio data from interview recordings 

| Archived data formats | Contemporary formats |
| --- | --- |
| Reel-to-reel tapes | WAV. Resolution: usually 16 or 24 bit, sampled at rates between 44.1 kHz and 192 kHz. Bitrate: 1411 kbps for CD-quality resolution (16 bit, 44.1 kHz). |
| Audio cassettes | FLAC. Resolution: usually 16 or 24 bit, sampled at rates between 44.1 kHz and 192 kHz. Bitrate: 500 to 2500 kbps. |
| Digital or digitized recordings | MP3. Resolution: most commonly 16 bit, 44.1 kHz. Bitrate: usually between 64 kbps and 320 kbps. |
| Other formats like microtapes, etc. | Other formats like Ogg, AMR, etc. are more suited for smaller file sizes. |
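The CD-quality bitrate in the table follows directly from the resolution: sample rate × bit depth × number of channels. The sketch below reproduces the 1411 kbps figure, assuming two-channel (stereo) audio, and estimates the storage needed for a one-hour uncompressed recording. The function names are illustrative.

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels=2):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


def file_size_mb(bitrate_kbps, duration_seconds):
    """Approximate size in megabytes (1 MB = 1,000,000 bytes), ignoring file headers."""
    bits = bitrate_kbps * 1000 * duration_seconds
    return bits / 8 / 1_000_000


cd_quality = pcm_bitrate_kbps(44_100, 16)    # 16 bit, 44.1 kHz, stereo
high_res = pcm_bitrate_kbps(192_000, 24)     # 24 bit, 192 kHz, stereo
one_hour_wav = file_size_mb(cd_quality, 60 * 60)

print(f"CD quality: {cd_quality:.1f} kbps")          # 1411.2 kbps
print(f"24 bit / 192 kHz: {high_res:.1f} kbps")      # 9216.0 kbps
print(f"One hour at CD quality: {one_hour_wav:.2f} MB")  # 635.04 MB
```

The same arithmetic shows why a one-hour interview takes several hundred megabytes as uncompressed WAV but far less as MP3 at 64 to 320 kbps.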

 
 
One example of raw data is the audio file that we get from recording an interview. The reference recording for this example is taken from the US Library of Congress.

Source: US Library of Congress

Raw video data 

Another example of raw qualitative data is raw video data. Video data is also available in different formats and sizes. Usually, the raw data that is captured from a camera has a high bitrate and needs a lot of space for archival. However, raw data captured from smaller or consumer-grade video recorders and smartphones may already be compressed.

| Archived data formats | Contemporary video formats |
| --- | --- |
| Film reels. 8mm, 16mm and 35mm are the most common sizes. | Lossless or visually lossless digital video formats. Common formats are Apple ProRes, DNxHD, and GoPro CineForm. These files have large file sizes and are editing-friendly. Good for archival. |
| Video tapes | RAW video. Not to be confused with raw captured video in general: RAW is also a method of capturing video that gives more flexibility for editing later. However, it is not so useful for research purposes because it takes a lot of space. It is more suited to video that needs to be edited and released. |
| A digitized version of analog video | Compressed video. Common formats are H.264 and H.265 (HEVC). If the video needs to be manually analyzed or transcoded, this is the most suitable format. File sizes are small and the files are easy to handle, store, and play back. |

Social Media data as raw data

Social media analysis is another important way to understand market patterns and get some user insights. However, it is also more challenging due to the following aspects:

  1. Social media data is an example of raw data that is unstructured. Unstructured data is difficult to analyze.
  2. It is also difficult to capture social media data, except on platforms like Twitter and YouTube.
  3. A lot of social media data may not be useful. Therefore, it takes more effort to find useful patterns.
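As a small illustration of why unstructured data takes extra work, the sketch below turns free-text posts into a structured count of hashtags using Python's `re` and `collections.Counter`. The three posts are invented for this example and are not real social media data. The helper `extract_hashtags` is hypothetical.

```python
import re
from collections import Counter

# Invented posts, for illustration only.
posts = [
    "Loving my new Writex 100 pen! #stationery #writex",
    "Anyone else think the Phantom ball smudges? #stationery",
    "Back to school shopping done :) #BackToSchool #Stationery",
]

HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the hashtags in a post, lowercased so variants are counted together."""
    return [tag.lower() for tag in HASHTAG_PATTERN.findall(text)]


hashtag_counts = Counter()
for post in posts:
    hashtag_counts.update(extract_hashtags(post))

for tag, count in hashtag_counts.most_common():
    print(f"#{tag}: {count}")
```

Even this tiny step involves decisions, such as treating `#Stationery` and `#stationery` as the same tag, that structured raw data would not require.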

What are the common pitfalls with Raw Data? 

  1. More data is not always better
  2. Starting with raw data rather than with a hypothesis
  3. Trying to find answers in raw data
  4. Not handling raw data properly
