A blueprint for connected data

You can find the first part of the blog about connected data here.

Go Connected

To facilitate implementation, we present our system blueprint for leveraging the connected data world. The blueprint extends the typical data ingestion, storage and processing landscape towards a more flexible use of all data sources across the enterprise, with the promise of transformative outcomes.

The blueprint shows three areas of development that are foundational to the connected data world. Enterprises will have to develop skills and implement supporting systems for data connectivity, people connectivity and data intelligence. As connected data becomes more and more critical for businesses and integrates deeply into people’s lives, those systems will gradually emerge and evolve into mission-critical resources.

The path to connected data is full of technological and organisational challenges. New kinds of challenges are also appearing, and they will need to be tackled openly. Some of them are listed here:

  1. legal requirements: these depend on the data type, country, industry and use case, and apply in particular to personal data. For instance, to work with credit card data in the United States, you need to comply with the PCI DSS standard, a document of around 140 pages of detailed conditions on processing and storage [1].
  2. ethical requirements [2]: even if they are still not well defined for all data handling steps, ethical considerations dictate that nothing evil should be done with data. For instance, in the European Union, where the new, stringent General Data Protection Regulation is due to come into force in May 2018, companies should provide both data and algorithms when discrimination is suspected.
  3. technological and algorithmic continuity: as the deep usage of data and algorithms matures, there is a growing motivation to rethink technological and mathematical boundary conditions from the point of view of sustainability. Will the selected technology still be available in 5 to 10 years? And how easy would it be to migrate it to the state of the art?
  4. trust: this concerns physical data ownership, in particular when data is saved in external facilities, i.e. outside your own IT department. Over the last decade, saving data in the cloud has become a common trend [3]. In that case enterprises need to rely on partners who guarantee physical and virtual data access and security. Very often your data lands at one of “the big four” [4]: Amazon, Microsoft, IBM and Google, with Amazon being the biggest cloud provider, holding about a third of the market share and more than 35 data centres throughout the world. Once you put your data with one of them, it is often very difficult to understand where it is stored. The data of your Gmail account, for instance, is certainly held on more than one server and in more than one country [4]. As an end user, you sometimes do not even know who ultimately holds your physical data, as in the case of Apple’s iCloud. Recently Apple finally confirmed [5] that it relies on Google’s public cloud for iCloud data storage. Does that mean we trust Apple less now? It is probably more of a rhetorical question.

Outlook

Data gives new power to modern decision making and to the provisioning of intelligent products and services. According to recent studies [6], by 2020, 90% of large enterprises will generate revenue from data-as-a-service, with artificial intelligence pervading many digital transformation initiatives. Data technology experts will be indispensable for building an understanding of how to apply technology to the collected data. By leading a dialogue with business departments, they will be able to create significant additional business value out of the data. The time is ripe to influence and leverage developments in the data field. It is a massive opportunity for businesses to bring about positive change in all facets of human society. Let’s begin!

[1] PCI Security Standard Council. Library. Source: https://www.pcisecuritystandards.org/document_library?category=pcidss&document=pci_dss, visited on 11.04.2018

[2] Rainer Graefen. Über Datenethik bei der Datenanalyse wird noch zu wenig debattiert [Data ethics in data analysis is still debated too little]. Source: https://www.storage-insider.de/ueber-datenethik-bei-der-datenanalyse-wird-noch-zu-wenig-debattiert-a-689022/?cmp=sm-li-swyn&utm_source=linkedin&utm_medium=sm&utm_campaign=linkedin-swyn&lipi=urn%3Ali%3Apage%3Ad_flagship3_feed%3BN6R67zsHSNOAggKER9Frbw%3D%3D&licu=urn%3Ali%3Acontrol%3Ad_flagship3_feed-object, visited on 11.04.2018

[3] Paula Doe. Huge Growth in Cloud Changes Semiconductor Supply Chain. Source: http://www.semi.org/en/huge-growth-cloud-means-changes-across-semiconductor-supply-chain, visited on 17.04.2018

[4] Rob Crossley. Where in the world is my data and how secure is it? Source: http://www.bbc.com/news/business-36854292, visited on 11.04.2018

[5] Jordan Novet. Apple confirms it uses Google’s cloud for iCloud. Source: http://www.xing-news.com/reader/news/articles/1225407?cce=em5e0cbb4d.%3AAA1pFumY6a8-qZ1lk4Y-qaAR&link_position=digest&newsletter_id=31299&toolbar=true&xng_share_origin=email, visited on 17.04.2018

[6] IDC. IDC Predictions Provide a Blueprint and Key Building Blocks for Becoming a Digital Native Enterprise. Source: https://www.idc.com/getdoc.jsp?containerId=prUS43185317, visited on 17.04.2018