6 Data Design Separating Data, Part 2

[MUSIC] Welcome back, everyone. In this lesson we're going to continue where we left off last time, and learn about how to design and structure our data tables for effective reuse and data analysis. Last time, we discussed one to many joins and foreign keys in our data tables. This time, we'll use that same concept of a foreign key where we specify a record in another table that has more information on the current record using its ID to associate much more data and open up new data possibilities. So now, let's continue with our example from last time. Before we wrap this up, let's make one more change to our table to help you understand the way that you might want to structure your data. Let's imagine for a moment that instead of one person going out to observe this, we have multiple people. And we want to know everyone who's out there so we can understand data quality. We can measure what's going on when we send specific people out into the field. Well, we could do this in a kind of a clunky way and just add a comma to the end of the Observer field and then add another name there. But that doesn't give us the data structure that allows us to effectively do things like joins and then go query our data in a structured way. To do that, we're going to need an entirely new table again. So just like we did before, we're going to take this field out and throw it in a new table here. So I'm going to create a new table And you might call it something like staff or field workers or something, but just to keep it consistent with what I was telling you before, I'm going to call it Observers. Keep it more generic so it can kind of include anybody. And once again, I need to start out my table with an ID field, a primary key of some sort that uniquely identifies each record. I'll just add those down for future records here. And then, Now we can even accommodate more information about these people. I might have First Name, And I might have Last Name. And we can even add something like an address field or payroll information, or whatever. And keep one table that has all the information about our people in it and then join that in when we need to. So we'll create that record for Liam here. And we'll create a record for Sarah here. And we'll create a record for Kelly here. And then we can have additional staff here like Rachel, Bill and on and on. And then I won't give them all last names, but we can have data for their last names in each of their records here, right? So at this point, to refactor my initial table of observations here I still just keep the one field, but instead of it being Observer, I'm going to erase their names here. But I'll do it one by one, so we can make the changes together. And then the field name would change to something like Observer ID, right, since we have Observers here and then we're just trying to reference that foreign table. So I'll write that a little smaller. ID, okay. So this first record was Liam. And again since we're just referencing this foreign table, we're going to do Observer ID Liam, and that'll be 1 here. And then if we have Sarah, That's going to be Observer ID 2. And then Kelly here, Observer ID 3. And then back to Sarah here, Observer ID 2. And then if we need to join them, which, again we can do that in ArcGIS or we could do it in a personal geodatabase. If we wanted to match these two together and we don't need the spatial information for a little while, we can open this up in Microsoft Access using a personal geodatabase. And we can just link these together using a standard join there and view them. And we'd get something if we were going to join it where this record here would get appended to the end of record 1, because Observer ID 1 corresponds to Liam. Foreign key of 1 to primary key of 1. And then we get Sarah getting appended here but also to this last record. And then Kelly getting appended over here. So remember when I said that we're going to try to get it to where we could mark multiple observers for an observation? We're still not there yet, right? Because since the Observer ID is here I only store one value and that references back here. What we did is basically what we did before with the sights and observations. We're going to do a more advanced thing now called the many to many join which allows us to associate many observers with many observations. To do that, we need still one more table. So I'm going to erase this join here, and we'll create that table. So, to make this work, I need to create an intermediary table that helps me relate these two together, and it just has two fields in it. It's going to say there's an observer and it was part of an observation. And we can do that in multiple different types of associations in it. So, let's create that field over here. And simple table. And we have Observer, ID and Observation, ID. And pretty straightforward. We know that Observer 1 was at Observation 1 or was present for Observation 1 and Observer 2 was present for Observation 2. And Observer 3 was present for Observation 3, or took Observation 3 potentially. But then also that Observer 2 did Observation 4, Which was all stuff we could collect with this table already, right? But where it's more powerful now is that we can say, well, maybe Observer 1 was also part of Observation 4. So we can just add that here. And now we know that two people were part of Observation 4. If we did relate in here we could relate this table, the Observations table, to this table here which I usually name something like both tables effectively so that we know where it's relating them. Observer's Observations, which is a confusing name in its own way, but. So, Observer's Observations, so we can join this table here over to the Observations table by Observation ID. Let's bring that around over here. Let me switch colors for that so you can see it a little better. So we'll take this Observation ID here and bring it around over here and join it to Observations table at 1. And this will just get appended to the end. Some database systems kind of transparently handle this table to join these to the end here, but for now let's visualize the whole system. And then we end up with Observation ID 2 getting appended over to 2 here. going to connect both sides just for space. And then 3 getting appended here to 3. And then both of these here getting appended to 4, which would duplicate that record temporarily. And then to the other side of this, now this table here is kind of visually on the end here. We have the Observers table here which we're going to link on Observer ID. So we'd have Liam, Observer ID 1, linked over here, and Sarah, Observer ID 2, linked here, and Kelly, Observer ID 3, linked here. And then we go back and link Sarah also to here, and we link Liam also to here. And the effect of this would be that once these are appended here and we're multiplying records again, that we have a table where we can access this name information associated with the record here. So, let's really quickly just trace that out as to what it would look like visually once you joined it, and kind of add this phantom name field to the end here. And we'll just do it for these top two records. So Observation 1 traces all the way over here to this record, and then we go to Observer ID 1, and that's Liam. So Liam was here. And then record ID 2 kind of comes down here and it turns out that it's Observer ID 2, and that's Sarah. So Sarah was there. And then number 3 traces back here like this, and that goes to Kelly. And then 4 is what we did all this for, right. With 4 we can trace it to both of these two records along these lines, and then we get numbers 1 and 2 here, right? So, in that case I write Liam, and then I get a duplicated record for Sarah down here. This will all make a lot more sense once you play with it. But use this as a conceptual guide to the concepts of what a join is and how you should think about structuring your data if you are getting new data from somewhere else. Or if you're sending people out into the field to collect data for your own projects. And then when you do it, play with it a little. See what happens when you attach data to other data and if it works out the way you expect or no. And then come back and reference this video after playing with it a bit so that you can solidify the concepts. Okay, so in this lecture, we talked about how to structure your data so that you can separate it out appropriately and join it back together in a way that lets you manage your data better in the event that you need to change things like names. And in a way that might let you also incorporate third-party data to your information here.

Wyszukiwarka

Podobne podstrony:
5 Data Design Separating Data, Part 1
function mailparse msg get part data
2 Data Design Attribute Types
4 Data Design Joins and Relates
Data shortcut
Data quotes
data sequence
data (3)
DOD Net Centric Data Strategy and Community of Interest (COI) Training Glossary
IEEE Finding Patterns in Three Dimensional Graphs Algorithms and Applications to Scientific Data M
SHSpec 224 6212C13 R2 12 Data Needle Behavior
rc183 010d data mining 1
function mysql data seek
Data Store Read
3 4 Screw Compressor Technical Data

więcej podobnych podstron