{"id":1355,"date":"2023-10-18T16:51:29","date_gmt":"2023-10-18T16:51:29","guid":{"rendered":"https:\/\/d3mlabs.de\/?p=1355"},"modified":"2023-11-02T14:29:29","modified_gmt":"2023-11-02T14:29:29","slug":"how-large-language-models-will-transform-data-operations","status":"publish","type":"post","link":"https:\/\/d3mlabs.de\/?p=1355","title":{"rendered":"How Large Language Models are Transforming Data Operations"},"content":{"rendered":"\n<p>Large language models are disrupting data operations and changing how data teams are structured and work together. <\/p>\n\n\n\n<p>I (<a href=\"https:\/\/www.linkedin.com\/in\/elizabethpress\/\" title=\"\">Elizabeth Press<\/a> from D3M Labs) spoke with <a href=\"https:\/\/www.linkedin.com\/in\/leonid-nekhymchuk\/\" title=\"\">Leonid Nekhymchuk<\/a> (Leo), CEO and Co-Founder of <a href=\"https:\/\/datuum.ai\/\" title=\"\">Datuum.ai<\/a>, about how large language models will transform data operations.&nbsp; Datuum uses AI to connect data sources with target models and automate mapping, making data integration less time-consuming and less expensive.&nbsp;<\/p>\n\n\n\n<p>My discussion with Leo in the upcoming video focuses on how LLMs help with data integration.&nbsp;<\/p>\n\n\n\n<p>This article is an overview of some of the topics Leo and I covered, as well as my own perspective.<\/p>\n\n\n\n<p class=\"has-medium-font-size\"><strong>How will large language models impact data operations?<\/strong><\/p>\n\n\n\n<p>Leo identified a couple of high-impact use cases, which he explains in the video.&nbsp;<\/p>\n\n\n\n<p>Data integration stands out as one of the most formidable challenges in the data ecosystem. It requires a hard-to-find combination of managerial and technical expertise, which makes the skill set scarce and the work costly to do well. 
The process itself is intricate and prone to issues such as pipeline disruptions, system malfunctions, and faulty code, all of which can compromise data quality and erode trust in the data.<\/p>\n\n\n\n<p class=\"has-medium-font-size\"><strong>I asked ChatGPT about the impact of large language models on data operations.&nbsp;The use cases it named beyond data operations included:<\/strong><\/p>\n\n\n\n<p><em>Analytical tasks such as data summarization and exploration<\/em>. Many BI tools are integrating ChatGPT functionality that turns natural language queries into visualizations, bypassing SQL.&nbsp;<\/p>\n\n\n\n<p><em>Anonymization and improved privacy <\/em>by replacing personally identifiable information (PII) with synthetic, privacy-preserving placeholders was one use case it suggested.&nbsp;<\/p>\n\n\n\n<p><em>Security and threat monitoring <\/em>is another use case. Large language models can help identify threats and anomalies in the data.&nbsp;<\/p>\n\n\n\n<p><em>Training and documentation.<\/em> Data teams are often underwater with tasks. Documentation is an enabler of smooth operations, yet it often falls victim to de-prioritization in favor of more \u201curgent\u201d tasks. Large language models will help us automate that! Sphinx and Confluence are two of my favorite tools for documentation.&nbsp;<\/p>\n\n\n\n<p class=\"has-medium-font-size\"><strong>Will large language models kill jobs in data?&nbsp;<\/strong><\/p>\n\n\n\n<p>This blog shall contain no spoilers! Watch the video coming out on November 2nd.&nbsp;&nbsp;<\/p>\n\n\n\n<p>Job profiles and team composition will change. 
When simple data tasks are done more quickly, the focus will shift to generating insights.&nbsp;<\/p>\n\n\n\n<p class=\"has-medium-font-size\"><strong>To all the people asking me about junior jobs:<\/strong><\/p>\n\n\n\n<p>Leo offered some advice about upskilling on LLMs and pointed out gaps in enterprise adoption.&nbsp;<\/p>\n\n\n\n<p class=\"has-medium-font-size\"><strong>Watch the conversation between Leo and Elizabeth on YouTube<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n\t\t<div class=\"embed-privacy-container is-disabled embed-youtube\" data-embed-id=\"oembed_f31b4b9c4b8bd3f1490778724f15573b\" data-embed-provider=\"youtube\" style=\"aspect-ratio: 500\/281;\">\t\t\t\t\t\t<button type=\"button\" class=\"embed-privacy-enable screen-reader-text\">Show \u201cHow Large Language Models are Transforming DataOps\u201d from YouTube<\/button>\t\t\t\t\t\t<div class=\"embed-privacy-overlay\">\t\t\t\t<div class=\"embed-privacy-inner\">\t\t\t\t\t<div class=\"embed-privacy-logo\" style=\"background-image: url(https:\/\/d3mlabs.de\/wp-content\/plugins\/embed-privacy\/assets\/images\/embed-youtube.png?ver=1.12.4);\"><\/div>\t\t<p>\t\tClick here to display the content from YouTube.\t\t\t\t\t<br>\t\t\t\t\tLearn more in <a href=\"https:\/\/policies.google.com\/privacy?hl=de\" target=\"_blank\">YouTube's privacy policy<\/a>.\t\t<\/p>\t\t<p class=\"embed-privacy-input-wrapper\">\t\t\t<input id=\"embed-privacy-store-youtube-f31b4b9c4b8bd3f1490778724f15573b\" type=\"checkbox\" value=\"1\" class=\"embed-privacy-input\" data-embed-provider=\"youtube\">\t\t\t<label for=\"embed-privacy-store-youtube-f31b4b9c4b8bd3f1490778724f15573b\" class=\"embed-privacy-label\" data-embed-provider=\"youtube\">\t\t\t\tAlways display content from YouTube\t\t\t<\/label>\t\t<\/p>\t\t\t\t\t\t<\/div>\t\t\t\t\t\t\t\t<div 
class=\"embed-privacy-footer\"><span class=\"embed-privacy-url\"><a href=\"https:\/\/youtu.be\/Fkg2T_py0Ks\">Open \u201cHow Large Language Models are Transforming DataOps\u201d directly<\/a><\/span><\/div>\t\t\t<\/div>\t\t\t\t\t\t<div class=\"embed-privacy-content\">\t\t\t\t<script>var _oembed_f31b4b9c4b8bd3f1490778724f15573b = '{\\\"embed\\\":\\\"&lt;iframe title=&quot;How Large Language Models are Transforming DataOps&quot; width=&quot;500&quot; height=&quot;281&quot; src=&quot;https:\\\\\/\\\\\/www.youtube-nocookie.com\\\\\/embed\\\\\/Fkg2T_py0Ks?feature=oembed&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share&quot; referrerpolicy=&quot;strict-origin-when-cross-origin&quot; allowfullscreen&gt;&lt;\\\\\/iframe&gt;\\\"}';<\/script>\t\t\t<\/div>\t\t<\/div>\t\t\n<\/div><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p><span style=\"text-decoration: underline;\">Related articles<\/span><\/p>\n\n\n\n<p><a href=\"https:\/\/d3mlabs.de\/?p=443\" title=\"\">Is scary data pipeline technical debt haunting your business?<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Large language models and generative AI are disrupting how data work is done. 
I (Elizabeth Press from D3M Labs) spoke with Leonid Nekhymchuk (Leo), CEO and Co-Founder of Datuum.ai, about how large language models will transform data operations.\u00a0 Datuum uses AI to connect data sources with target models and automate mapping, making data integration less time-consuming and less expensive.\u00a0 <\/p>\n","protected":false},"author":1,"featured_media":1369,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[19,44,13,57,56],"tags":[],"class_list":["post-1355","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-strategy","category-data-pipelines","category-dataops","category-generative-ai","category-large-language-models","wpcat-19-id","wpcat-44-id","wpcat-13-id","wpcat-57-id","wpcat-56-id"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/d3mlabs.de\/index.php?rest_route=\/wp\/v2\/posts\/1355","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/d3mlabs.de\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/d3mlabs.de\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/d3mlabs.de\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/d3mlabs.de\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1355"}],"version-history":[{"count":10,"href":"https:\/\/d3mlabs.de\/index.php?rest_route=\/wp\/v2\/posts\/1355\/revisions"}],"predecessor-version":[{"id":1371,"href":"https:\/\/d3mlabs.de\/index.php?rest_route=\/wp\/v2\/posts\/1355\/revisions\/1371"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/d3mlabs.de\/index.php?rest_route=\/wp\/v2\/media\/1369"}],"wp:attachmen
t":[{"href":"https:\/\/d3mlabs.de\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1355"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/d3mlabs.de\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1355"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/d3mlabs.de\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1355"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}