Why did you decide to make a career in data governance and now specifically data cataloging?
My professional path has been shaped by many opportunities and crossroads. You don’t have to choose between IT and business. The role of facilitator also holds a lot of opportunities.
I started my professional life about 30 years ago, in the early 90s, while doing a training program in Belgium focused on small and medium enterprise management. I started at the front office of a hotel and met my wife. Both of us were working irregular hours at the same time, which was not ideal. I had always been interested in IT. My port of entry into the tech world was training people in Microsoft applications. From there I moved to a call center company, where I helped with their outsourcing processes. At that time, outsourcing your support to third parties was very hyped.
My first step into the data management world was 15 years ago, when I joined Ansell, a US-based worldwide leader in personal protective equipment. I had the opportunity to manage a program in master data management (MDM). Everything was Excel-based and the organization wanted to implement a global platform. We had to convert a manual process into a more robust approach.
Since then, I have held various consulting positions in data management and decided to specialize in data governance, which is a broad topic.
My experience has given me solid expertise in data governance. Three years ago, I decided to leave the consultancy world to work for a software vendor and joined DataGalaxy, a data catalog solution provider.
What advice do you have for people wanting to work in data governance?
There is no specific career path into data governance. There is no school that makes you a data governance practitioner. It is about how you put yourself in the day-to-day world, how you position yourself around data management and data cataloging, and how you build your skills.
Why have you decided to become a product and data governance evangelist?
At first I was not an evangelist. I was in charge of setting up and running DataGalaxy’s professional services. That meant I learned to master our platform and helped our customers model their use cases in the data catalog.
In the operational field, I learned a lot about the clients’ day-to-day challenges.
In parallel, I was posting once in a while on LinkedIn to share experience and expertise. A year ago, I decided to post every day and run the Data Governance Kitchen, which brought me more than 8,000 followers. This was how I became a data governance evangelist externally.
Internally, we at DataGalaxy agreed that I would focus solely on product and data governance evangelisation to address a couple of aspects:
What are the key responsibilities of a Product and Data Governance Evangelist?
- DataGalaxy’s brand awareness on the international market
- Representing the company at events, exhibitions, webinars and more
- Being the voice of the market within DataGalaxy to help product teams understand our customer challenges and the value we can bring.
- Furthering the market fit from the product perspective.
What is the difference between data governance and data cataloging?
Data cataloging is a component of data governance supporting the identification and contextualization of your data assets. To learn more about the different components of data governance, read this article on TDAN.
What is metadata and why is it so important for data cataloging?
Metadata is information about your data. Let’s pick a concrete example. I’m passionate about food, where ingredients play a big role. If you enter your cold chamber and only have carton boxes on the shelves, which ones will you pick to prepare your dishes? These boxes are your data. So what do you do? You read the labels (12 cans of 250 g of peeled tomatoes flavored with garlic). This is metadata. Metadata is information and context about your data. It describes in detail the type of data you have on hand.
Data and metadata are interdependent.
Without metadata your catalog is useless. Without metadata, you can’t define, trace or contextualize your data items.
What can you learn about your organization from metadata in your data catalog?
A metadata-based data catalog brings several dimensions to your data assets:
- Technical description and parameters (type, length, primary and foreign keys, …)
- Business and/or operational classification and description of your data assets through the wording used in naming.
- The dataflow between databases is visible because the source and the target are documented as properties of a data item, along with the applied transformation.
- Usage of the data and the data delivery mechanism (dashboards, reports, …), because you document this in a data catalog. The metadata of a report identifies the data sources used.
- Traceability based on the links between all the above points. Each point along the data’s journey is documented, thus the data lineage is traceable.
- Organizational information is also available on the data governance layer. This contextualizes the information: ownership or stewardship is identified through the enrichment that people perform in the data catalog on a day-to-day basis.
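As an illustration only, the dimensions listed above could be sketched as a small data structure. The `CatalogEntry` class, its field names, the `trace_lineage` helper and the sample items are all hypothetical; they are not taken from DataGalaxy or any other catalog product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CatalogEntry:
    """One data item as a catalog might describe it (hypothetical model)."""
    name: str                          # technical name of the item
    data_type: str                     # technical description, e.g. "VARCHAR"
    length: Optional[int] = None       # technical parameter
    business_term: str = ""            # business classification or description
    owner: str = ""                    # organizational information (ownership)
    sources: List[str] = field(default_factory=list)   # upstream items (dataflow)
    transformation: str = ""           # transformation applied from the sources
    used_in: List[str] = field(default_factory=list)   # reports or dashboards


def trace_lineage(catalog: Dict[str, CatalogEntry], name: str) -> List[str]:
    """Walk upstream from `name` and return every item it depends on, nearest first."""
    lineage: List[str] = []
    to_visit = list(catalog[name].sources)
    seen = set()
    while to_visit:
        current = to_visit.pop(0)
        if current in seen:
            continue
        seen.add(current)
        lineage.append(current)
        if current in catalog:
            to_visit.extend(catalog[current].sources)
    return lineage


# Made-up sample entries, for illustration only.
catalog = {
    "crm.customer.email": CatalogEntry(
        name="crm.customer.email", data_type="VARCHAR", length=255,
        business_term="Customer email", owner="Sales Operations",
    ),
    "dwh.dim_customer.email": CatalogEntry(
        name="dwh.dim_customer.email", data_type="VARCHAR", length=255,
        sources=["crm.customer.email"], transformation="lowercase and trim",
    ),
    "mart.churn.contact_email": CatalogEntry(
        name="mart.churn.contact_email", data_type="VARCHAR",
        sources=["dwh.dim_customer.email"], used_in=["Churn dashboard"],
    ),
}

upstream = trace_lineage(catalog, "mart.churn.contact_email")
print(upstream)
```

Because each entry records its own sources, lineage is not stored separately; it falls out of following the documented links, which is the traceability point made in the list.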
Can you have a good data catalog with messy metadata?
The quality of your data catalog is dependent on the quality of your metadata at source level.
It is crucial to properly document metadata in all your systems to get the maximum value out of your data catalog. You need to clean your metadata before you get a data catalog, otherwise your data catalog will never be accurate.
Of course you can and will enrich your data catalog with additional attributes or metadata, but the initial set of metadata should be of good quality.
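One way to get a first impression of metadata quality at source level is a simple completeness check before loading anything into a catalog. The sketch below is an assumption-laden example: the required fields, the `completeness_report` function and the sample records are hypothetical, and real quality rules would depend on your systems.

```python
# Fields assumed to be mandatory for every item (hypothetical rule).
REQUIRED_FIELDS = ("description", "data_type", "owner")


def completeness_report(items):
    """Return, per item name, the required metadata fields that are missing or blank."""
    report = {}
    for item in items:
        missing = [
            field_name
            for field_name in REQUIRED_FIELDS
            if not str(item.get(field_name) or "").strip()
        ]
        if missing:
            report[item["name"]] = missing
    return report


# Made-up source metadata, for illustration only.
source_metadata = [
    {"name": "customer.email", "description": "Primary contact email",
     "data_type": "VARCHAR", "owner": "Sales Operations"},
    {"name": "customer.col_17", "description": "",
     "data_type": "VARCHAR", "owner": None},
]

gaps = completeness_report(source_metadata)
print(gaps)
```

Items that show up in the report are candidates for cleanup at the source before they reach the catalog.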
When should a company think about data catalogs?
Now! It is never too early or too late.
What we’ve seen is that there is no specific size or industry that calls for a data catalog. Many small or mid-sized organizations would simply start with an Excel file or SharePoint to capitalize on people’s and operational knowledge.
The move to a more robust, online and collaborative data catalog comes naturally as a second step, as they gain data governance and operational maturity.
You can compare a data catalog to building a house. You need strong foundations before adding the floors and the roof. A data catalog is your foundation for data management and governance.
Have you seen differences across geographies and industries?
Maturity varies across geographies and industries.
Highly regulated industries such as banking, finance, insurance and pharma have been using some kind of cataloging for many years. The main drivers were clearly the regulatory and compliance reports due to the authorities.
Other sectors and start-ups were able to develop their technical ecosystem in a more controlled way, free from legacy systems. Many SaaS applications allowed a better understanding of their data assets.
However, they also have an increasing need for data cataloging as they grow.
From a geographical standpoint, maturity in the US is higher than in Europe. And within Europe, the Nordics and DACH might be a bit ahead of southern Europe.
What are the most important things to think about before setting off on a data catalog journey?
One important fact is that tooling should come last.
First, you need to define your data-driven vision and strategy at the executive level. This will enable the rest of the organization to understand WHY we should consider ourselves data-driven.
Having executive support will give you the necessary visibility at board level to secure budget and resources.
This is where a Chief Data Officer (CDO) would play an important role.
The next step would be to create a task force supporting the execution of the vision. It requires a multi-disciplinary set-up with IT, Business, Project Management and HR. HR is important because having data steward activities recognized and incentivized is key to success.
The task force will talk to operational people to understand the day-to-day challenges that implementing a data catalog can relieve.
Only then can you start defining the requirements for the technical solution and selecting the most appropriate tool.
We need a combination of top-down and bottom-up approaches. The data catalog sits at the intersection.
What are some top challenges you see companies face when implementing data catalogs?
There are many challenges when embarking on your data governance and cataloging journey.
Let’s try to identify a top six:
- The lack of vision: As previously mentioned, defining the company vision of data is a critical factor in unlocking many opportunities, as well as in communicating the common goal across the entire organization and on the market.
- The lack of business involvement: We frequently see the data governance or catalog implementation being led by the IT organization. This probably fails 98% of the time. One of the guiding principles is that data governance and the data catalog should be owned by the business. They know the data, and they consume and value it on a day-to-day basis.
And don’t get me wrong, IT is still around. They support the initiative by helping with the data dictionary, since they technically own the systems.
- The lack of use cases: buying a tool has never been helpful if you don’t know the problem you want to solve and the value you want to create. This can only be defined and measured if you have use cases.
- The lack of human centricity: the implementation of a data catalog requires a lot of change management activities. Most of the knowledge is in people’s heads. Breaking silos, extracting information and sharing knowledge are only possible if you put the human at the center of your journey.
- The lack of communication: it is an enterprise program which deserves a lot of communication. Any opportunity to talk about data centricity will support the initiative and will show people its importance.
At an operational level, it is important to communicate about the implementation progress, the content already available, the roadmap of use cases … so people see the pace of progress.
- The last part is evangelisation. Preaching the good word on data centricity and the data catalog, and showcasing success, will help big time!
When should a company not think about this solution?
Never. Having a data catalog in place is a must-have. Look at companies such as Uber, Amazon or Facebook. They quickly learned the value of data and they surely have a data catalog in place.
What first steps can a data leader and company take to approach this subject?
If I were one of these leaders, I would first talk with my peers in the industry or geography. This would give me a better understanding of their challenges and the way they addressed the subject.
Second, I would initiate an internal roadshow at all levels to assess the organizational maturity and identify the first priorities, since you won’t be able to address everything at the same time.
Then, get the buy-in and funding from the executive leadership.
Based on that, create a task force to shape the program and define the roadmap.
What has been your secret sauce to success?
Communication, humility, diplomacy, pragmatism, curiosity, and humor.
Who is Laurent Dresse?
Laurent Dresse, Chief Evangelist, brings his expert industry knowledge, experience, and determined energy to the table to help solve your company’s challenges. Holding a graduate degree in SME Management, Laurent began his career at Stefanini as a Solution Engineer. After six years, he became the Manager of European IT Support at Coca-Cola Enterprises, where he was a key player in establishing state-of-the-art support in manufacturing and office operations.
Today, Laurent is DataGalaxy’s top evangelist and thought leader, using his market expertise and observations to educate the public on key data governance topics.
With over ten years of experience, Laurent has successfully completed more than 20 international projects with Ansell, Cognizant, and BearingPoint. Laurent will work effectively with your teams, listen to their ideas and concerns, and implement the necessary changes to make their Data Catalog initiative a great success!
From D3M Labs