{"id":23198,"date":"2022-02-08T12:53:22","date_gmt":"2022-02-08T17:53:22","guid":{"rendered":"https:\/\/www.3pillarglobal.com\/?p=23198"},"modified":"2024-08-14T20:22:55","modified_gmt":"2024-08-14T20:22:55","slug":"navigating-data-bias","status":"publish","type":"blog","link":"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/","title":{"rendered":"Navigating Data Bias"},"content":{"rendered":"\nIn the late 1970s, our military changed its scheduled maintenance approach to an on-demand philosophy. This lowered the lifecycle costs of many kinds of assets (including land, sea and air vehicles), and it allowed for spending our tax dollars on smarter maintenance activities rather than unnecessary repairs.\n<br><br>\nAs the program rolled out, the Air Force was initially skeptical when a young UCLA engineer informed them that the data they had sequestered for predictive failure analysis had a blind spot. That was how my first aerospace data science project began, and as we discuss below, blind spots are a common problem for many data science projects.\n<br><br>\n<h2>History Lesson for Today\u2019s Data Science Projects with Blind Spots<\/h2>\nIf you\u2019re currently managing a data science project and suspect it might also have a blind spot, let\u2019s first go back in history to World War II. A significant amount of ordnance was flown across the English Channel to Europe by the Allies. This campaign came at a tremendous cost in lives and equipment, and there was a strong desire to armor the aircraft to make them more resilient against enemy defenses.\n<br><br>\nMany data science students today are taught the case study of the WWII Allied statisticians who examined the aircraft that made it back from missions. The statisticians wanted to see where the bullet holes were on the planes so they could instruct aircraft designers to reinforce those areas.\n<br><br>\nMakes sense, right? 
Except for Abraham Wald, a Jewish mathematician who fled Nazi persecution in Europe and emigrated to the U.S. He served in the Statistical Research Group, a bunch of eggheads at Columbia University who used math to make the military better at everything from firing rockets to shooting down enemy fighters.\n<br><br>\nWald had the epiphany that it was the planes that didn\u2019t make it back that had the most important stories to tell. Their bullet holes, rather than those on the planes that returned, truly indicated where armor should be added. This meant there was a survivorship bias in the data, which would lead to incorrect recommendations. (<a href=\"https:\/\/www.boredpanda.com\/world-war-2-aircraft-survivorship-bias-abraham-wald\/\" target=\"_blank\" rel=\"noopener\">Here\u2019s a great article<\/a> if you want to read more about this.)\n<br><br>\n<h2>Past Relates to Current-Day Project<\/h2>\nNow coming back to more recent times and a similar project\u2026 I was presented with data from turbine engines, airframes, and other major aircraft components that had been disassembled. I also reviewed the maintenance logs from normal maintenance modes. But I could not see the data from the equipment that had failed (badly, suddenly or catastrophically) or any of the information leading up to those events. Exceptional maintenance events, especially catastrophic failures, often didn\u2019t make it into the normal log information. We only had the survivor cases. Attempts to integrate data from forensic sources proved difficult, and that data could not be normalized, so we didn\u2019t have a data set suitable for proper inferences.\n<br><br>\nTo solve this, we had to get innovative about engineering new data sources. For example, in the area of turbine engines, we had to work with a manufacturer to coat key engine components with different trace elements. 
Then, we had to run the engines over time, drain the oil periodically, and measure the content of the trace materials in the oil, which indicated the rate of wear of the specific components.\n<br><br>\nNext, we ran the test set under varying conditions to see how they affected wear and what the precursors of failure looked like. Once sufficient data was acquired, we knew how an engine would age based on time and temperature. We even learned that certain modes of vibration would indicate an impending failure.\n<br><br>\n<h2>Determining If Your Data Is Biased<\/h2>\nNow to your data science project: Does your data suffer from survivorship bias? Have the culture and discipline around the measurements or metrics been unbiased? Will the data you plan to refine to make smarter decisions give you honest advice?\n<br><br>\nFortunately, there are specific industry sectors where this is a known problem, such as hedge funds, where tracking only the funds that have survived will likely bias reported returns upward. Hedge fund companies take many steps to normalize their data so that they can remove any survivorship bias.\n<br><br>\nThere are other kinds of data bias too:\n<br><br>\n<ul>\n \t<li><strong>Sample Selection Bias<\/strong> happens when certain data is excluded; for instance, the specifics around cases that didn\u2019t go as expected or those with a great many anomaly edits.<\/li>\n \t<li><strong>Look-Ahead Bias<\/strong> occurs when the test parameter uses information that was not available on the test date; for example, a price-to-book ratio calculated with a mix of real-time data and data that was not available until the end of a quarter.<\/li>\n \t<li><strong>Time Period Bias<\/strong> results when a particular trend appears in the data only within a selected time period; this means the period is too narrow or unrepresentative to support a general inference. 
(Sometimes, external events such as interim government regulations can skew the data without anyone noticing.)<\/li>\n<\/ul>\nThere are even data sets with a surprise measurement bias, where the data was acquired in two different generations and the test or measurement criteria were not the same. This can be caused by the measurement techniques, equipment calibration, training of technicians, and other factors. It\u2019s not apples to oranges; it\u2019s more like Fuji apples to Pink Lady apples. They look the same but aren\u2019t.\n<br><br>\n<h2>Techniques to Test for Bias<\/h2>\nThe good news is that techniques have been developed to test for the different types of data bias. Some use deep statistical tests, and some require testing algorithms against independent data sets to validate that similar conclusions are reached.\n<br><br>\nThe tests selected usually map to the particular kind of bias that is suspected. These bias tests are part and parcel of a properly managed data science project. After all, you want to be sure you have \u201cunbiased data\u201d before you allow it to drive your critical business decisions.\n<br><br>\nIf you have an internal data science team ready for a project launch, it\u2019s prudent to take a close look at the data before starting. If the data is not what you need, you may have to consider additional methods and costs for obtaining the required data.\n<br><br>\nOr, if an outside organization will deliver a turnkey, data-science-driven process or a black-box machine learning algorithm, be sure they take the time to survey your data for bias early in the lifecycle of the project. 
This might have implications for the data size (storage costs) and the computing power required to achieve the inferences you are looking for.\n<br><br>\nIn the meantime, our sincere best wishes for you to make history with your data science venture!\n<br><br>\nIllustration: <a href=\"https:\/\/commons.wikimedia.org\/wiki\/File:Survivorship-bias.svg\">Martin Grandjean (vector), McGeddon (picture), Cameron Moll (concept)<\/a>, <a href=\"https:\/\/creativecommons.org\/licenses\/by-sa\/4.0\">CC BY-SA 4.0<\/a>, via Wikimedia Commons<\/p>\n\n\n\n<div style=\"height:100px\" aria-hidden=\"true\" class=\"wp-block-spacer is-style-small\"><\/div>\n\n\n<div class=\"wp-block-heading\">\n<h2 class=\"wp-block-heading\">About the author<\/h2>\n<\/div>\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n<div class=\"wp-bootstrap-blocks-row row\">\n\t\n\n<div class=\"col-12 col-lg-6\">\n\t\t\t\n\n    <div  class=\"custom-block card-profile-block card card-profile card-image-offset card-image-offset-right card-image-offset-sm card-border-left text-bg-light-cyan\">\n        <div class=\"card-body\">\n            <div class=\"row flex-md-row-reverse\">\n                <div class=\"col-md-5\">\n                                            <img loading=\"lazy\" decoding=\"async\" width=\"532\" height=\"532\" src=\"https:\/\/www.3pillarglobal.com\/wp-content\/uploads\/2024\/07\/leadership-henry-martinez.jpg\" class=\"img-fluid\" alt=\"Henry Martinez portrait\" data-aos=\"none\" srcset=\"https:\/\/www.3pillarglobal.com\/wp-content\/uploads\/2024\/07\/leadership-henry-martinez.jpg 532w, https:\/\/www.3pillarglobal.com\/wp-content\/uploads\/2024\/07\/leadership-henry-martinez-300x300.jpg 300w, https:\/\/www.3pillarglobal.com\/wp-content\/uploads\/2024\/07\/leadership-henry-martinez-172x172.jpg 172w\" sizes=\"auto, (max-width: 532px) 100vw, 532px\" \/>                                    <\/div>\n\n                <div class=\"col-md-7\">\n        
            <h3 class=\"card-title\">Henry Martinez<\/h3>\n                    \n                                            <p class=\"card-text\">Senior Director, Global Head of Solutions, Engineering &#038; Architecture<\/p>\n                                        \n                    <a href=\"https:\/\/www.3pillarglobal.com\/?post_type=leadership&#038;p=1638\" class=\"link-arrow link-arrow-dark\">\n                        <span>Read bio<\/span>\n                    <\/a>\n                <\/div>\n            <\/div>\n        <\/div>\n    <\/div>\n\n\t<\/div>\n\n\n\n<div class=\"col-12 col-lg-6\">\n\t\t\t\t<\/div>\n\n<\/div>\n\n\n\n<div style=\"height:100px\" aria-hidden=\"true\" class=\"wp-block-spacer is-style-small\"><\/div>\n","protected":false},"excerpt":{"rendered":"<p>In the late 1970s, our military changed its scheduled maintenance approach to an on-demand philosophy. This lowered the lifecycle costs of many kinds of assets (including land, sea and air vehicles), and it allowed for spending our tax dollars on smarter maintenance activities rather than unnecessary repairs. 
As the program rolled out, the Air Force [&hellip;]<\/p>\n","protected":false},"featured_media":28484,"template":"","industry-types":[],"service-types":[49,48],"topics":[28],"class_list":["post-23198","blog","type-blog","status-publish","has-post-thumbnail","hentry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Navigating Data Bias - 3Pillar<\/title>\n<meta name=\"description\" content=\"In this article, 3Pillar provides guidance to determine if your data is biased, and reveals strategies for resolving any bias that exists.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Navigating Data Bias - 3Pillar\" \/>\n<meta property=\"og:description\" content=\"In this article, 3Pillar provides guidance to determine if your data is biased, and reveals strategies for resolving any bias that exists.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/\" \/>\n<meta property=\"og:site_name\" content=\"3Pillar\" \/>\n<meta property=\"article:modified_time\" content=\"2024-08-14T20:22:55+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.3pillarglobal.com\/wp-content\/uploads\/2024\/07\/navigating-data-bias.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1208\" \/>\n\t<meta property=\"og:image:height\" content=\"680\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/\",\"url\":\"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/\",\"name\":\"Navigating Data Bias - 3Pillar\",\"isPartOf\":{\"@id\":\"https:\/\/www.3pillarglobal.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.3pillarglobal.com\/wp-content\/uploads\/2024\/07\/navigating-data-bias.jpg\",\"datePublished\":\"2022-02-08T17:53:22+00:00\",\"dateModified\":\"2024-08-14T20:22:55+00:00\",\"description\":\"In this article, 3Pillar provides guidance to determine if your data is biased, and reveals strategies for resolving any bias that exists.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/#primaryimage\",\"url\":\"https:\/\/www.3pillarglobal.com\/wp-content\/uploads\/2024\/07\/navigating-data-bias.jpg\",\"contentUrl\":\"https:\/\/www.3pillarglobal.com\/wp-content\/uploads\/2024\/07\/navigating-data-bias.jpg\",\"width\":1208,\"height\":680,\"caption\":\"Navigating Data 
Bias\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.3pillarglobal.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Navigating Data Bias\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.3pillarglobal.com\/#website\",\"url\":\"https:\/\/www.3pillarglobal.com\/\",\"name\":\"3Pillar\",\"description\":\"Together we create incredible\",\"publisher\":{\"@id\":\"https:\/\/www.3pillarglobal.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.3pillarglobal.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.3pillarglobal.com\/#organization\",\"name\":\"3Pillar\",\"url\":\"https:\/\/www.3pillarglobal.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.3pillarglobal.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/qa-www.3pillarglobal.com\/wp-content\/uploads\/2024\/08\/3pillar-organization-logo.png\",\"contentUrl\":\"https:\/\/qa-www.3pillarglobal.com\/wp-content\/uploads\/2024\/08\/3pillar-organization-logo.png\",\"width\":696,\"height\":696,\"caption\":\"3Pillar\"},\"image\":{\"@id\":\"https:\/\/www.3pillarglobal.com\/#\/schema\/logo\/image\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Navigating Data Bias - 3Pillar","description":"In this article, 3Pillar provides guidance to determine if your data is biased, and reveals strategies for resolving any bias that exists.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/","og_locale":"en_US","og_type":"article","og_title":"Navigating Data Bias - 3Pillar","og_description":"In this article, 3Pillar provides guidance to determine if your data is biased, and reveals strategies for resolving any bias that exists.","og_url":"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/","og_site_name":"3Pillar","article_modified_time":"2024-08-14T20:22:55+00:00","og_image":[{"width":1208,"height":680,"url":"https:\/\/www.3pillarglobal.com\/wp-content\/uploads\/2024\/07\/navigating-data-bias.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/","url":"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/","name":"Navigating Data Bias - 3Pillar","isPartOf":{"@id":"https:\/\/www.3pillarglobal.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/#primaryimage"},"image":{"@id":"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/#primaryimage"},"thumbnailUrl":"https:\/\/www.3pillarglobal.com\/wp-content\/uploads\/2024\/07\/navigating-data-bias.jpg","datePublished":"2022-02-08T17:53:22+00:00","dateModified":"2024-08-14T20:22:55+00:00","description":"In this article, 3Pillar provides guidance to determine if your data is biased, and reveals strategies for resolving any bias that exists.","breadcrumb":{"@id":"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/#primaryimage","url":"https:\/\/www.3pillarglobal.com\/wp-content\/uploads\/2024\/07\/navigating-data-bias.jpg","contentUrl":"https:\/\/www.3pillarglobal.com\/wp-content\/uploads\/2024\/07\/navigating-data-bias.jpg","width":1208,"height":680,"caption":"Navigating Data Bias"},{"@type":"BreadcrumbList","@id":"https:\/\/www.3pillarglobal.com\/insights\/blog\/navigating-data-bias\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.3pillarglobal.com\/"},{"@type":"ListItem","position":2,"name":"Navigating Data 
Bias"}]},{"@type":"WebSite","@id":"https:\/\/www.3pillarglobal.com\/#website","url":"https:\/\/www.3pillarglobal.com\/","name":"3Pillar","description":"Together we create incredible","publisher":{"@id":"https:\/\/www.3pillarglobal.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.3pillarglobal.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.3pillarglobal.com\/#organization","name":"3Pillar","url":"https:\/\/www.3pillarglobal.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.3pillarglobal.com\/#\/schema\/logo\/image\/","url":"https:\/\/qa-www.3pillarglobal.com\/wp-content\/uploads\/2024\/08\/3pillar-organization-logo.png","contentUrl":"https:\/\/qa-www.3pillarglobal.com\/wp-content\/uploads\/2024\/08\/3pillar-organization-logo.png","width":696,"height":696,"caption":"3Pillar"},"image":{"@id":"https:\/\/www.3pillarglobal.com\/#\/schema\/logo\/image\/"}}]}},"_links":{"self":[{"href":"https:\/\/www.3pillarglobal.com\/wp-json\/wp\/v2\/blog\/23198","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.3pillarglobal.com\/wp-json\/wp\/v2\/blog"}],"about":[{"href":"https:\/\/www.3pillarglobal.com\/wp-json\/wp\/v2\/types\/blog"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.3pillarglobal.com\/wp-json\/wp\/v2\/media\/28484"}],"wp:attachment":[{"href":"https:\/\/www.3pillarglobal.com\/wp-json\/wp\/v2\/media?parent=23198"}],"wp:term":[{"taxonomy":"industry-types","embeddable":true,"href":"https:\/\/www.3pillarglobal.com\/wp-json\/wp\/v2\/industry-types?post=23198"},{"taxonomy":"service-types","embeddable":true,"href":"https:\/\/www.3pillarglobal.com\/wp-json\/wp\/v2\/service-types?post=23198"},{"taxonomy":"topics","embeddable":true,"href":"https:\/\/www.3pillarglobal.com\/wp-json\/wp\/v2\/
topics?post=23198"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}