4 Data Design Joins and Relates


[MUSIC] Hello again, and welcome back, everyone. In this lesson, we're going to talk about how to separate data, and join it back together. We'll cover why you might separate your data into multiple tables, and the mechanics of how it works including primary and foreign keys, joints, and relates. These concepts are important both for keeping your data clean, and for obtaining data and using it from third parties. So first, imagine a situation where you go out into the field in whatever your discipline is and you collect some data on a GPS. You collect the location and its name as well as a couple other relevant parameters. Now, imagine that you go back a second time to those same locations and collect some other parameters. We need an effective way to attach that new data Back to the original data, so that they can be used and analyzed in one feature class. Joins are the tool that allow us to do things like this. A join is a general concept for taking two separate data tables and bringing them together into one, appending the attributes of one onto the records of another. For all of these, they share a common attribute of some sort that helps us to know which attributes to attach to which records. How we do this varies. Sometimes we use a common attribute between tables as in a table join. Sometimes we use their location in space as in a spacial join, and sometimes we even use a fuzzy match on the time the records represent. For now, let's just talk about table joins. When I say table joins, I'm also referring to feature classes. Since feature classes are built on tables, they are a superset of tables but can also be viewed as just a table. Joins aren't specific to GIS. Instead, they're a feature common to all relational database management systems or RDBMSs. Working with databases doesn't have to be scary though, and it can be a lot of fun too. The core concept of relational database systems, from which it gets the word relational in its name, is that you have multiple tables which you can relate to each other through common attributes. Some common relational database systems you may or may not have heard of are Microsoft Access, MySQL, or Microsoft SQL Server. While providing different technologies they all provide the same core functionality for storing and working with data and allow access to data through structured query language commands or SQL which is the same code you write when you work with select by attributes, label classes, and definition queries in ArcGIS. For table joins we have both primary keys and foreign keys. These are the attributes we use to connect your tables. Regardless of joining data a table almost always has a primary key and these are usually numbers. This enables us to use uniquely reference records in the table Through an attribute that doesn't mean anything to humans, that's important, so that it wont change, when we change our minds or get new data. Imagine without a primary key, if we tried to reference these records by their names but then we were to change the names. Everything that references them by name is now out of date and broken. In contrast, having a primary key allows us to change any attributes that are actually important to us in a record, while still having a means to reference that record from somewhere else that won't change. Foreign keys are where we reference that information from somewhere else, like another table. If we have that secondary data table I mentioned earlier where went out and took additional data on a different date, the foreign key allows us to reference the original data directly in the data record of the new data. It's called the foreign key because it allows us to reference a primary key and a table that is foreign to the current table. Together, the primary key and the foreign key provide the basic building blocks for putting together the rest of table joins. Well, there are a handful of different types of table joins, they all require primary and foreign keys. Explicitly, they are one to one, one to many, and many to many joins. And we'll discuss there in more depth in the next couple lectures as we build up a data model based upon the concepts of joins. Now, we don't only need joins to bring data together just because we collected it at different times or have a complex data structure. Sometime we're working on part of the data ourselves, and someone else is working on another part. Or you build some data on top of someone elses standard data set and you need to reference it so that your data can be correctly applied. An example from my own work, is how we work with river data. The definitive river data set in the United States is one called NHDPlus that's freely published and available. If I want to build a data set base on that river layer I need to build the data table that references the primary key of each NHDPlus feature as a foreign key in my own data tables. Maybe I've taken flow measurement and want to build the data set of river flows. And as a field in my data set, I'll include a foreign key identifying the record in NHDPlus features that each record in my data set applies to. From here, we can start to build more complex data structures. And sometimes we need to represent these with relationship diagrams where we draw lines between listing of the attributes in each table to show how the tables reference and relate to each other. This helps us understand how our data is separated in a way that we can understand at a glance. In ArcGIS we make a distinction between joins and relates from the perspective of other database systems relates are just a specific type of join. So I'm going to talk about them conceptually in this class as if they're the same. But know that any time that multiple records in a foreign table get appended to a single record in the destination feature class, the primary table, ArcGIS needs to perform a relate something it calls a relate instead, and the capabilities of our data change. This is largely because duplicating records in the primary table duplicates the spatial parts of the data, which doesn't make sense for many workflows. The last thing I want to reiterate in this lecture is just a reminder of how we do joins in ArcGIS. You can right-click on a layer and select add join. From there, you can choose the layer from which to join information and the common fields that the two layers have. At that point information will be appended to the feature class from the foreign table. If you need to see this process in action review the lecture on performing joins in ArcGIS from the first class in this specialization. Okay, that's it for now. In summary, we talked about separating your data into multiple tables and then using joins to link these tables together with primary and foreign keys. Remember, that a primary key is a unique reference for a record. Typically, a number that is automatically generated by a database that doesn't mean anything to us. While a foreign key, allows us to reference a primary key in a table that is foreign to the current table. Relates are an ArcGIS specific feature that are a specific type of join and they occur whenever we need to attach multiple records from one table to a single record in another table. In the next lecture, we're going to build on what we talked about here and go through our own example of how to take a data set that we have in just a single table and slowly expand it out into these more complex data structures and work through types of joins we need in the process. I think it will be interesting, and it will teach you a lot. See you there.

Wyszukiwarka

Podobne podstrony:
Data scipio?mpaign regions and settlement names
Healthy eating for people with depression, anxiety and related disorders
Darrieus Wind Turbine Design, Construction And Testing
6 Data Design Separating Data, Part 2
2 Data Design Attribute Types
Data imperial?mpaign regions and settlement names
5 Data Design Separating Data, Part 1
Sketching in early conceptual phases of product design guidelines and tools
Design and performance optimization of GPU 3 Stirling engines
DOD Net Centric Data Strategy and Community of Interest (COI) Training Glossary
IEEE Finding Patterns in Three Dimensional Graphs Algorithms and Applications to Scientific Data M
Before And After Magazine Pure Design
Recommended Reading List naval analyst and wargame designer
Introducing the ICCNSSA Standard for Design and Construction of Storm Shelters

więcej podobnych podstron