
Introduction

At Datahub, every development project is individual, with the client at the centre of all aspects of development and delivery. We develop our projects using a development framework, which gives us a systematic approach that provides client visibility and minimises development errors. This ensures that all work is designed, developed, and tested efficiently, leading to quicker client approval.

With any project we want to make sure that it’s developed and integrated into your business quickly and efficiently. For this reason, we use the Data Analytics Lifecycle.

What is the Data Analytics Lifecycle?

The Data Analytics Lifecycle was initially designed to deliver big data and data science projects. It optimises how data is created, gathered, processed, and analysed to meet business objectives. Datahub have seen the benefits of using this model of development and have adopted it for the delivery of Business Intelligence and reporting projects as well.

The lifecycle consists of a 6-step process, from requirements gathering to delivery.

Project Kick-off

Before we start the development steps, we hold an initial project kick-off meeting. With most clients this can be a remote video call. The call allows expectations to be set, your organisation to meet the core Datahub staff, and a plan to be agreed.

1. Discovery

For all of our projects, whether a small report delivery project or a large-scale solution, we need to understand the business requirements and any problems or challenges that your business is facing with its data. We achieve this in a discovery workshop with key business stakeholders who understand the business processes.

In the discovery workshop we will want to understand your business, the key performance indicators (KPIs) that you use, and the processes within the business. We will also ask about the data sources, so that we understand where the data is coming from, and look to map the data from its source, such as a particular database, to the final solution. The solution could be a data warehouse, a machine learning algorithm, etc. We will then have enough information to start work on step 2 (Understand the Data).

2. Understand the Data

Once we have access to the data sources, we can start to understand the data in full. These data sources could be a database, some Excel files, a data lake, etc.

We then start to examine the data. The checks vary from project to project, but as a rule we check for:

  • What data is available.
  • The tables and column types that make up the data.
  • Any bad data that we need to address. This could include blank values in key columns, date columns that contain non-date values, columns where we expect numbers but find text, etc.
  • Any business logic that we would need to apply, and to which columns.
  • What columns are needed, and which ones can be disregarded.
  • The level of granularity that would be needed. If we need to produce reports that are at day level, then we will need data from the sources at this level.
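
As an illustration of the bad-data checks listed above, here is a minimal sketch using only the Python standard library. The sample rows, the column names (`order_id`, `order_date`, `amount`), and the `profile` helper are hypothetical and are not taken from any Datahub project; real checks will differ from source to source.

```python
from datetime import datetime

# Hypothetical sample rows standing in for an extract from a source system.
rows = [
    {"order_id": "1001", "order_date": "2024-01-05", "amount": "25.50"},
    {"order_id": "",     "order_date": "2024-01-06", "amount": "10.00"},
    {"order_id": "1003", "order_date": "not a date", "amount": "12.00"},
    {"order_id": "1004", "order_date": "2024-01-08", "amount": "twelve"},
]

def is_date(value, fmt="%Y-%m-%d"):
    """True if the text parses as a date in the assumed format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False

def is_number(value):
    """True if the text can be read as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def profile(rows, key_column, date_column, numeric_column):
    """Return the row positions that fail each check."""
    return {
        "blank_key": [i for i, r in enumerate(rows) if not r[key_column].strip()],
        "bad_date": [i for i, r in enumerate(rows) if not is_date(r[date_column])],
        "bad_number": [i for i, r in enumerate(rows) if not is_number(r[numeric_column])],
    }

issues = profile(rows, "order_id", "order_date", "amount")
print(issues)
```

Each list in `issues` holds the positions of the rows that need attention, which gives a simple starting point for agreeing how the bad data should be handled.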

3. Preparing the Data

In this step we prepare and process the data. This applies to both business intelligence and machine learning projects.

Business Intelligence project
We need to build a process for the transformation of the data. This is done with an ETL (Extract, Transform, and Load) process: an automated process that extracts the data from the sources, transforms and cleanses it, and then loads it into the final tables.
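
To make the three ETL stages concrete, the sketch below runs a tiny extract, transform, and load against an in-memory SQLite database. It is an illustration only: the `sales` table, the sample rows, and the rule of rejecting rows that fail conversion are assumptions, and a production ETL process would normally be built with dedicated tooling.

```python
import sqlite3
from datetime import datetime

def extract():
    # Stand-in for reading from a source system (hypothetical data).
    return [
        (" 1001 ", "2024-01-05", "25.50"),
        ("1002", "2024-01-06", "n/a"),   # bad amount, will be rejected
        ("1003", "2024-01-07", "12.00"),
    ]

def transform(raw_rows):
    """Cleanse and type-convert rows; keep rejects for review."""
    clean, rejected = [], []
    for order_id, order_date, amount in raw_rows:
        try:
            clean.append((
                int(order_id.strip()),
                datetime.strptime(order_date, "%Y-%m-%d").date().isoformat(),
                float(amount),
            ))
        except ValueError:
            rejected.append((order_id, order_date, amount))
    return clean, rejected

def load(conn, clean_rows):
    """Load the cleansed rows into the final table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", clean_rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
clean_rows, rejected_rows = transform(extract())
load(conn, clean_rows)
loaded = conn.execute("SELECT COUNT(*), SUM(amount) FROM sales").fetchone()
print(loaded, rejected_rows)
```

Keeping the rejected rows, rather than silently dropping them, is a design choice in this sketch so that data quality issues can be reported back to the business.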

Machine Learning
With machine learning we would use Python scripts to prepare and wrangle the data, so that it is cleansed and ready for the machine learning model to produce the best results.
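
A preparation script of this kind might look something like the following sketch, which fills a missing value, scales numeric columns to the range 0 to 1, and converts a text category into indicator columns. The records, the column names, and the choice of these three techniques are assumptions made for illustration; the right preparation steps depend on the data and the model.

```python
from statistics import mean

# Hypothetical customer records with one missing income value.
records = [
    {"age": 25, "income": 30000.0, "segment": "retail"},
    {"age": 40, "income": None, "segment": "trade"},
    {"age": 55, "income": 50000.0, "segment": "retail"},
]

def impute_mean(records, column):
    """Replace missing values with the mean of the known values."""
    fill = mean(r[column] for r in records if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in records]

def min_max_scale(records, column):
    """Rescale a numeric column to the range 0 to 1."""
    values = [r[column] for r in records]
    low, high = min(values), max(values)
    span = (high - low) or 1  # avoid division by zero for constant columns
    return [{**r, column: (r[column] - low) / span} for r in records]

def one_hot(records, column):
    """Replace a text category with one 0/1 column per category."""
    categories = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        row = {k: v for k, v in r.items() if k != column}
        for category in categories:
            row[f"{column}_{category}"] = 1 if r[column] == category else 0
        encoded.append(row)
    return encoded

prepared = impute_mean(records, "income")
for numeric_column in ("age", "income"):
    prepared = min_max_scale(prepared, numeric_column)
prepared = one_hot(prepared, "segment")
print(prepared)
```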

4. Modelling the Data

Business Intelligence project
At this stage we will build the STAR schema. An operational relational database is optimised for the transactional data that supports the day-to-day running of the business. For example, a retail organisation's database would include all the sales transactions for each store, and could also include the inventory that each store holds. This type of database is optimised for inserting and updating data, and typically has many tables.

For reporting we would create a STAR model made up of fact and dimension tables. It has fewer tables, is designed for holding large volumes of data, and is optimised for reporting.
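
The sketch below shows what a very small STAR model could look like for the retail example: one fact table of sales surrounded by a date dimension and a store dimension, followed by a typical reporting query. The table names, columns, and values are hypothetical, and an in-memory SQLite database is used only so that the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the 'who, where, and when' of each sale.
CREATE TABLE dim_date  (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
CREATE TABLE dim_store (store_key INTEGER PRIMARY KEY, store_name TEXT, region TEXT);

-- The fact table holds the measurable events and points at the dimensions.
CREATE TABLE fact_sales (
    date_key     INTEGER REFERENCES dim_date(date_key),
    store_key    INTEGER REFERENCES dim_store(store_key),
    quantity     INTEGER,
    sales_amount REAL
);

INSERT INTO dim_date VALUES
    (20240105, '2024-01-05', '2024-01'),
    (20240206, '2024-02-06', '2024-02');
INSERT INTO dim_store VALUES
    (1, 'High Street', 'North'),
    (2, 'Retail Park', 'South');
INSERT INTO fact_sales VALUES
    (20240105, 1, 2, 20.0),
    (20240105, 2, 1, 15.0),
    (20240206, 1, 3, 30.0);
""")

# A typical report: total sales by month and region.
report = conn.execute("""
    SELECT d.month, s.region, SUM(f.sales_amount) AS total_sales
    FROM fact_sales f
    JOIN dim_date d  ON d.date_key = f.date_key
    JOIN dim_store s ON s.store_key = f.store_key
    GROUP BY d.month, s.region
    ORDER BY d.month, s.region
""").fetchall()
print(report)
```

Every report follows the same pattern of joining the fact table to the dimensions it needs, which is what keeps reporting queries simple compared with querying the operational database directly.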

Machine Learning
This is the stage where the data scientists design the model. They will look at the data and the requirements to determine which model(s) they want to develop. They may decide to consider two models to see which produces the best outcome.

After cleansing the data, the team determines the techniques, methods, and workflow for building a model in the next stage. They explore the data, identifying relationships between data points to select the key variables, and eventually devise a suitable model.

5. Build the Model

Business Intelligence project
From the OLAP (STAR) model we will start to create a semantic layer. This consists of a tabular model that includes the data from the model, plus any code that builds and applies the business logic. Business logic is any data required for the reports that is not in the base data. This could include calculations based on counts, averages, sums, etc. In step 1 we would have derived the calculations with the business stakeholders. Then in step 2 we would have checked that the data is available, and of sufficient quality, for the calculations to be derived.
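
The idea of business logic as named calculations, defined once and reused across reports, can be sketched as follows. This is plain Python for illustration only; it does not represent the tabular modelling tool or language used on a real project, and the measure names and sample rows are hypothetical.

```python
from statistics import mean

# Hypothetical fact rows, already joined to a region attribute.
sales = [
    {"region": "North", "order_id": 1, "amount": 20.0},
    {"region": "North", "order_id": 2, "amount": 30.0},
    {"region": "South", "order_id": 3, "amount": 15.0},
]

# Each measure is a named calculation agreed with the business stakeholders.
measures = {
    "order_count": lambda rows: len(rows),
    "total_sales": lambda rows: sum(r["amount"] for r in rows),
    "average_order_value": lambda rows: mean(r["amount"] for r in rows),
}

def evaluate(rows, group_by):
    """Apply every measure to the rows, grouped by one attribute."""
    groups = {}
    for r in rows:
        groups.setdefault(r[group_by], []).append(r)
    return {
        key: {name: calculate(group) for name, calculate in measures.items()}
        for key, group in groups.items()
    }

by_region = evaluate(sales, "region")
print(by_region)
```

Because the calculations live in one place, every report that uses `average_order_value` gets the same definition.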

At this point we would have all the data and the model prepared, ready for report building.

Machine Learning
This is the stage where our data scientists start to build, train, and test a machine learning model. The building process is an iterative cycle that optimises the model until an acceptable output is achieved.
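
The build, train, and test cycle can be pictured with the toy sketch below: a one-parameter model is trained a little at a time, tested against data it has not seen, and the loop stops once the test error reaches an acceptable level. The data, the learning rate, the `target_error` threshold, and the use of simple gradient descent are all assumptions chosen to keep the example short; real projects use established machine learning libraries and far richer models.

```python
# Toy data following y = 2x (hypothetical), split into training and held-out test sets.
train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
test = [(5.0, 10.0), (6.0, 12.0)]

def mean_squared_error(weight, data):
    """Average squared difference between predictions and actual values."""
    return sum((weight * x - y) ** 2 for x, y in data) / len(data)

def train_one_round(weight, data, learning_rate):
    """One gradient-descent step on the mean squared error."""
    gradient = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
    return weight - learning_rate * gradient

weight = 0.0
target_error = 0.01  # assumed definition of an 'acceptable output'
max_rounds = 200     # safety limit so the loop always ends

for round_number in range(1, max_rounds + 1):
    weight = train_one_round(weight, train, learning_rate=0.01)  # train
    test_error = mean_squared_error(weight, test)                # test
    if test_error <= target_error:                               # accept?
        break

print(f"stopped after {round_number} rounds, weight={weight:.3f}, test error={test_error:.4f}")
```

Testing against held-out data, rather than the training data, is what gives confidence that the model will behave sensibly on new data.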

6. Present the Data

For any business intelligence project, we will create the required reports and dashboards.
With any machine learning project, we will present and apply the machine learning model and integrate it into your systems, with automation that allows the model to be used with live data.

Find out how we can help

We do not employ salespeople; our team are all experienced technical specialists who can talk you through any of our services.

Contact us