Aug 10, 2019 BigData, Cloud, Security
In this article, we look at the security and privacy concerns surrounding Big Data.

Big Data is like a snowball rushing down a mountain, gaining speed and volume. We now have far more opportunities to collect Big Data than ever before: there are billions of internet-capable devices, such as smartphones and IoT devices. Now think of all the big data security issues this can generate.

Giving big data security a low priority and setting it aside until the later stages of a big data project isn't a great move. Unfortunately, many of the tools associated with big data are open source, and most of the time they are not designed with security as a primary function, which leads to big data security issues.

Big Data Security Issues

1. Existence of untrusted mappers

Big data goes through a number of stages in the MapReduce paradigm. The data is first split into chunks, which a mapper processes into key-value pairs and allocates to particular storage. If an unauthorized person gains access to the mappers' code, they can manipulate the settings of the existing mappers. In this way, data processing can be effectively ruined. Cybercriminals can make the mappers produce incorrect key-value pairs, which then go as input to the reducers, which in turn produce faulty results.

The main problem is that getting such access is not that difficult, since most big data technologies do not provide an additional layer of security to protect data. They typically rely on perimeter security alone, which is not enough to secure sensitive data.
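
To make the effect of an untrusted mapper concrete, here is a minimal, framework-free sketch of a word-count job. It does not use Hadoop or any real MapReduce API; the function names (`honest_mapper`, `tampered_mapper`, `run_job`) and the sample records are hypothetical and exist only for illustration. The same reducer produces correct totals from the honest mapper and silently wrong totals from the tampered one.

```python
from collections import defaultdict


def honest_mapper(line):
    # Emit one (word, 1) pair per word in the input line.
    return [(word.lower(), 1) for word in line.split()]


def tampered_mapper(line):
    # Hypothetical attack: drop every "fraud" key and inflate every "ok" key.
    pairs = []
    for word in line.split():
        word = word.lower()
        if word == "fraud":
            continue
        pairs.append((word, 10 if word == "ok" else 1))
    return pairs


def reducer(pairs):
    # Sum the values for each key; the reducer trusts whatever it receives.
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(lines, mapper):
    # Map every line, then reduce all emitted pairs in one step.
    pairs = [pair for line in lines for pair in mapper(line)]
    return reducer(pairs)


records = ["ok ok fraud", "fraud ok"]
honest_result = run_job(records, honest_mapper)
tampered_result = run_job(records, tampered_mapper)
```

Because the reducer has no way to tell the two mappers apart, the faulty output looks just as legitimate as the correct one, which is why the mapper code itself needs protection beyond the network perimeter.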

2. Chances of sensitive information mining

Perimeter-based security is typically used for big data. This means placing the necessary safeguards at the entry points of a privately owned network to secure it from hackers. But what IT specialists do inside the system remains unknown.

This lack of internal security can allow corrupt IT specialists or business rivals to mine unprotected data and sell it for profit.

Data can be protected more securely by adding extra perimeters. For additional security we can also apply anonymization, a data processing technique that removes or modifies personal information. If somebody then obtains users' personal data with the names, addresses, and mobile numbers absent, they can do very little with it.
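
As a rough illustration of the anonymization step described above, the sketch below removes direct identifiers and modifies one remaining field by coarsening it. The field names (`name`, `address`, `mobile`, `age`, `purchase`) and the 10-year age bands are assumptions made for this example, not part of any standard, and real anonymization usually needs more than this to resist re-identification.

```python
# Fields assumed (for this example) to identify a person directly.
DIRECT_IDENTIFIERS = {"name", "address", "mobile"}


def anonymize(record):
    # Remove direct identifiers entirely.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Modify a quasi-identifier: replace the exact age with a 10-year band.
    if "age" in cleaned:
        low = (cleaned["age"] // 10) * 10
        cleaned["age"] = f"{low}-{low + 9}"
    return cleaned


user = {
    "name": "Jane Doe",
    "address": "1 Example Street",
    "mobile": "555-0100",
    "age": 34,
    "purchase": "book",
}
anonymous_user = anonymize(user)
```

The original record is left unchanged, so the anonymized copy can be handed to analysts while the identifiable version stays in a more tightly controlled store.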

3. Granular Access Control

When it comes to access control, data secrecy is important: people who should not have access to data must be prevented from getting it. For example, in medical data, patient-identifying information (name, email ID) is kept hidden from medical researchers because they do not need it.

But in big data, such granular access is difficult to grant and control because big data technologies were not designed for it. One solution is to copy the parts of a dataset that a user has the right to see into a separate big data warehouse and provide them to the particular user group as a new dataset.
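
The copy-what-they-may-see approach can be sketched as a per-role column filter. The role names, column names, and the `ROLE_COLUMNS` mapping below are hypothetical; a real deployment would load such rules from a policy store and write the result to a separate warehouse rather than keep it in memory.

```python
# Hypothetical policy: which columns each user group is allowed to see.
ROLE_COLUMNS = {
    "researcher": {"age", "diagnosis"},
    "physician": {"name", "email", "age", "diagnosis"},
}


def build_view(role, records):
    # Copy only the permitted columns; unknown roles get no columns at all.
    allowed = ROLE_COLUMNS.get(role, set())
    return [{k: v for k, v in row.items() if k in allowed} for row in records]


patients = [
    {"name": "Jane Doe", "email": "jane@example.com", "age": 34, "diagnosis": "flu"},
    {"name": "John Roe", "email": "john@example.com", "age": 51, "diagnosis": "asthma"},
]
researcher_view = build_view("researcher", patients)
```

Defaulting unknown roles to an empty column set means a misconfigured or unexpected group sees nothing rather than everything.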

4. Data Provenance

Data provenance is primarily concerned with metadata (data about data), which is extremely helpful for finding out where data is stored, who accessed it, and what was done with it.

Data provenance is a major big data concern. It matters for security because an unauthorized change in metadata can lead you to the wrong datasets, which makes it difficult to find the information you need.
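
One way to notice an unauthorized change in metadata is to store a keyed signature next to each metadata record and verify it before trusting the record. The sketch below uses HMAC-SHA256 from the Python standard library; the metadata fields and the key are placeholders, and the article itself does not prescribe this or any other specific technique.

```python
import hashlib
import hmac
import json


def sign_metadata(metadata, key):
    # Serialize with sorted keys so the same metadata always yields the same bytes.
    payload = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def is_untampered(metadata, signature, key):
    # Constant-time comparison of the expected and the stored signature.
    return hmac.compare_digest(sign_metadata(metadata, key), signature)


secret_key = b"placeholder-key-for-illustration-only"
metadata = {
    "dataset": "sales_2019",
    "location": "/warehouse/sales/2019",
    "last_accessed_by": "analyst_7",
}
signature = sign_metadata(metadata, secret_key)

# Simulate an unauthorized edit that redirects readers to another dataset.
forged = dict(metadata, location="/warehouse/tmp/fake")
```

Verification fails for `forged` because its serialized form no longer matches the signature, while the untouched `metadata` still verifies. The protection is only as strong as the secrecy of the key.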

5. NoSQL Database

NoSQL databases are a popular trend in big data analytics, and this popularity is exactly what causes the problem.

NoSQL databases are continuously being upgraded with new features. And, just as with many other big data technologies, security is often neglected and left for later stages.

6. Granular auditing

Granular auditing can help determine when missed attacks happened, which is very helpful in knowing what should be done to improve matters in the future.

It helps companies gain awareness of their security gaps. Although it is advised to perform audits on a regular basis, this expectation is rarely met in reality. Audit records are themselves a lot of data; working with big data already raises enough concerns, and an audit only adds to them.
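
A granular audit trail can start as one structured record per access attempt, which can later be filtered to find attacks that were missed at the time. The sketch below keeps the log in a plain list; the field names and the helpers `record_event` and `denied_events` are hypothetical, and a production system would write to tamper-resistant, append-only storage instead.

```python
from datetime import datetime, timezone


def record_event(log, user, dataset, action, allowed):
    # Append one audit record per access attempt, including denied ones.
    log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "dataset": dataset,
        "action": action,
        "allowed": allowed,
    })


def denied_events(log, dataset=None):
    # Return denied attempts, optionally restricted to a single dataset.
    return [
        event for event in log
        if not event["allowed"] and (dataset is None or event["dataset"] == dataset)
    ]


audit_log = []
record_event(audit_log, "alice", "patients", "read", True)
record_event(audit_log, "mallory", "patients", "export", False)
record_event(audit_log, "mallory", "billing", "read", False)

suspicious = denied_events(audit_log, dataset="patients")
```

Recording who did what to which dataset, rather than only that "something happened", is what makes the later investigation possible, at the cost of the extra data volume the paragraph above mentions.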


There are many big data security issues, and they are serious. But that doesn't mean we should stop doing big data analytics. What we can do is carefully design our big data plan and give security the place it deserves.