[MUSIC] Welcome back everyone. In this lesson, we're going to cover data
structures and joins in database tables. This sounds maybe a little more
technical than some of you thought you signed up for. But it's a really critical component
of GIS that you should understand. And it's easy to understand at
the level that you need to know it. To begin with, let's take a data table
similar to something that we might have in my line of work of watershed sciences. And it's going to be field data which I
hope you can easily adapt to the type of work you're doing. It's where someone goes out and collects
some information using a sensor and a GPS at some remote location and then
brings it back and needs to analyze it. So we'll start by creating a table here that'll look just like your
data tables in ArcGIS. And I'm going to create a column here. I'm just going to call it ID. That'll be our primary key column
like you're used to from the datasets you've been working with. And then, if we think about it, probably
the next most basic column we need in a GIS is a location field of some sort. We're used to that being
called Shape in ArcGIS, but you could have your latitude and longitude
coordinates or something like that. For now,
I'm going to call it Shape just so it mirrors what you'll
probably see in ArcGIS. And then, what's left is
the data we actually captured when we sent somebody out into the field. So in this case,
we're going to have Velocity or for stream velocity, the measure of flow. And then we'll have Depth for
the stream depth at that location and then we may also want to
know who took that measurement. So I'm going to create
a field called Observer here. And that's who was sent out in the field. That, in this case,
could be something like the name. So let's create a couple records. I'm just going to give them
IDs really quickly, 1, 2, 3 and Shape will just be
some sort of binary data. We can't actually read but
ArcGIS can display it for us and that'll go all the way through there. And then for Velocity and Depth,
I'm just going to make up some numbers so we can work with them. 12, 8, 7 and
Depth in whatever units they're collecting in, might be 4, 17, 8. Something like that. And then Observer,
give these people names. Liam, Sarah and Kelly. Okay, so
we have a really basic data table here. Records our information in a way
that seems acceptable to me and doesn't really have any special cases. We could just keep adding more and more
records, more locations, more velocity, more depth, more observers. But where it starts to get concerning is
as we grow out in specific directions, we run into potential complications where
we want to start separating our data into multiple tables that we bring back
together with things called joins or relates in ArcGIS. In database parlance, it tends to be
joins across table relationships, but in ArcGIS there are joins and relates,
which we'll explain at some point. So, the next thing we might want to
do with this data table would be to add a field indicating when
this information was collected so that we can track
observations across time. So I'm just going to remove the end here,
So we can extend our table
out a little further. And add another field called Date. As a side note, you don't generally
want to actually call your fields Date. That's what's called the reserved word and
the database might choke on it. You might call it something
like observation date. But for this particular example we'll
call it Date just to keep it short. And I'll put 2016-01-14, which, it says 14, for today's date. And then we'll do 2015-12-7-07 there and then we'll do 2015-11-18 or something like that for
those observation dates. If I start adding more records
here with more dates and more locations, this is fine. But think about what
happens if I start wanting to record multiple observations for
the same location here. I start duplicating information
about my location and if we do that, we're going to get multiple features
in ArcGIS at the same location. And they're going to overlap each other
and that's going to be inconvenient. Because if all the sudden I want to
say edit the name of a location or the actual location I need to correct it,
then we run into issues where we have to do that across multiple records
and that's time consuming and error-prone. So what do we do about it? We're going to use that thing
called joins we were talking about. And we're going to create a new table and
separate it out so that we can attach it back when we need it but that the data
is kind of in its own object. We have data tables that have a theme,
that they can characterize a specific phenomenon and then we can bring them
together to understand something larger. So, I'll do that right over here for
now, and we'll call this table Observations. And then I'll create a table
over here that I'll call Sites. And in the Sites table, I'm also going to
have an ID, just like in the Observations table, but it's going to be an ID
that refers to its own things. And, in that Sites table, we're basically going to take this
Shape field and move it over here too. So, we'll have ID,
we're going to have Shape and maybe we'll have a site name now
that we have its own table for it. Site Name, and we'll draw those fields
down and I'll give it a few IDs here. 3, 4, and
then this is still binary data here. It represents feature
information in ArcGIS. And then we have a site name. I'm going to call them Site A and Site B. And maybe this one will be like
B downstream or something. Indicating it's still near B,
but it's not exactly at B. And then we'll call this one Site C. Now that I have it in this table, I can
remove it from the Observations table. And we'll just strip off
this whole field in here, And we'll rewrite that in a moment. We're going to modify this table now. In programming terms, a lot of times
that's called refactoring, when you take something you already have and you change
it to update it to your new parameters. You call that refactoring. So we're going to have two fields still, because we still need an ID field for
our Observations. And if you were going to do this in place, they'd keep their same existing
IDs if we were moving it out. The Sites would get this new ID and our Observations field would have their
same IDs so we have our ID field. And then we have our Site_ID field. And what we get with that is we're
going to reference the ID field of the Sites table in our Site_ID
field to say, this Velocity, this Depth measurement occurred
at the site with this ID. And then, we're going to bring those
together in ArcGIS when we need to know information about the site,
such as its location or its name. And in ArcGIS in particular, we'd actually
go in the direction of starting with the sites table that has the location and attaching this information
in that direction. And a lot of database systems,
you might start on this other side. So we'll reference it back and
say that this velocity and flow measurement occurred
at Site with Site_ID 1. And this one occurred at
Site with Site_ID 2 and this one occurred at Site with Site_ID 3. These happen to correspond right now
just to keep it consistent with what it was before. But we can add a new record
down here with ID 4 and let's say that this one
also happened at SiteID 2. So we have multiple records
at the same site and now we can attach those
back to one site over here. And 11 and 13 and let's say this one also was done by Sarah. And it occurred on some date. So, now we have additional records
to show that these aren't unique. This ID field is a reference
to this ID field. And one way that we can keep
this straight in our heads is if you make your table names plural, so
instead of just being Site, this is Sites. This has multiple different sites in
this table and then our foreign key here which references the key field,
the ID field in a foreign table. It starts with the non-plural version of
the table name and then ends with ID. You can come up with your own system. This isn't a rule. It's just a way to keep it straight so
that if I send this to somebody, they automatically know without much
additional documentation that the Site_ID field probably references the ID field
in Sites and can be joined that way. Now, for a moment,
just to make this a little stronger for you, let's take a moment to
visualize how that join would occur. If I have these Sites over here and
these Observations. If I have Sites in ArcGIS and I want to
attach the data, I'd use the join tool, which we'll show you in a screencast. I'd use the join tool and
attach this information. And how that would work is record
here with Site_ID gets appended to the end of the record for Site_ID 1. And the record here with Site_ID 2 that
occurred at Site_ID 2 rather goes and gets appended to Site_ID 2. And same for Site_ID 3. But then what happens, and
this is what makes this something called a one-to-many join, is this one,
since it also has Site_ID of 2, in ArcGIS, we'd use a relate to connect both of
these records with Site_ID 2 here. And get the records to show up for
us in our attribute table. Where this information with the Shape
shows up with both of these records. And that way we can go query for
information at Site_ID 2 and see everything that's happening
at Site_ID 2 in one shot. I'm going to erase these now so
you can see me again. We've covered a number of concepts in this
video already, so let's pause it here and we'll continue on with some more
advanced concepts in the next video. In this video, we covered how to
separate our data into multiple tables with a type of join called
a one-to-many join. We utilized primary keys and foreign keys to reference data in the
Sites table from our Observations table. And add additional information
to our observation records. In the next lesson,
we'll make this even more powerful with a new type of join
called the many-to-many join, where we associate large quantities of
new data with our Observations table.
Wyszukiwarka
Podobne podstrony:
6 Data Design Separating Data, Part 2function mailparse msg get part data2 Data Design Attribute Types4 Data Design Joins and RelatesData shortcutData quotesdata sequencedata (3)DOD Net Centric Data Strategy and Community of Interest (COI) Training GlossaryIEEE Finding Patterns in Three Dimensional Graphs Algorithms and Applications to Scientific Data MSHSpec 224 6212C13 R2 12 Data Needle Behaviorrc183 010d data mining 1function mysql data seekData Store Read3 4 Screw Compressor Technical Datawięcej podobnych podstron