2 Data Design Attribute Types


[MUSIC] Hello everyone, and welcome back. In this video, we're going to discuss attribute data types. By the end of this video, you will be able to choose appropriate attribute data types for your data tables and rasters. You will also have foundational knowledge for using attribute data types in queries. Finally, you'll be able to troubleshoot problems related to incorrect data types.
So first, what do I mean by data types? Data types help us group, store, and define capabilities for information. To start with, you can think of a few different types of data, generally speaking. We have numbers, text strings, dates and times, and some others. These are the human types that we often think of. Remember how I said data types define capabilities? Think of it this way: we can multiply two numbers, but we can't multiply two pieces of text. The result is not defined. So knowing that a value is a number rather than text is important to the computer, because it tells the computer which operations it can perform on that value. These data types matter everywhere, from vector attribute tables to the type of data in a raster, and have implications for the field calculator, which tools we can use, map algebra, definition queries, programming, and more.
Remember, also, that all data on computers goes back to binary information. Binary means one of two values, and, in the case of computers, that's commonly represented as 0s and 1s. What those 0s and 1s mean is up to human interpretation, which is why we come up with the rules, or data types, that specify what they mean in particular use cases. If we want to represent the number 84 in binary, we can write it as 1010100. That same binary value interpreted as text represents the letter T. So it matters which data type we consider this information to be, or else we get the wrong data: 84 versus T.
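As a quick illustration of the 84 versus T example, here is a minimal Python sketch. It assumes the text interpretation is ASCII, which is the encoding in which the value 84 maps to the letter T; the variable names are made up for this example.

```python
# The same bit pattern read under two different "data types".
bits = "1010100"

as_number = int(bits, 2)   # interpret the bits as a whole number
as_text = chr(as_number)   # interpret the same value as an ASCII character

print(as_number)  # 84
print(as_text)    # T

# Data types also define which operations are allowed.
product = 6 * 7            # multiplying two numbers is defined
try:
    "six" * "seven"        # multiplying two pieces of text is not
    text_product_defined = True
except TypeError:
    text_product_defined = False

print(product, text_product_defined)
```

The bits never change; only the rule used to read them does, and that rule is the data type.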
It's also worth mentioning that database storage systems can operate more quickly and efficiently when they know what type of data to expect. They can optimize the amount of storage space required and the algorithms used in processing the data. To computers, data types break down even further. When I said numbers, I really meant a whole group of data types. Let's think back to our days in school. Remember the difference between whole numbers, or integers, and real numbers? Whole numbers are non-fractional, in that they represent the numbers 0, 1, 2, 3, 4, and so on, but nothing in between. As a whole number, we cannot represent 2.340. Integers don't carry that kind of precision, so a computer can't represent values between whole numbers as integers. If you did want to represent the value 2.340, you can instead use a real, or decimal, number, which has a decimal place in it, so that we can represent fractions or non-whole values. Real numbers have the precision required to track these partial values.
We still divide number types even a little further, though, largely for efficiency of storage and calculation. I won't bore you with all of the details, but know that computers have a few different integer data types for dealing with whole numbers. This is largely the same across most computer systems, but it does vary, so let's talk specifically about ArcGIS. Data formats in ArcGIS have short integers and long integers. Short integers can contain integer values between -32,768 and +32,767. You'll get used to those numbers, even though they probably seem oddly specific to you right now. They come from how many binary bits we use to store the number. That range allows for 65,536 possible values and requires 16 binary bits to store any number within it, regardless of which number you are storing. In contrast, a long integer uses 32 binary bits, so twice as much, but it can store values between about -2 billion and +2 billion.
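The short and long integer limits follow directly from the bit counts. The sketch below assumes the usual signed (two's-complement) layout, which matches the ranges quoted above; `signed_range` and `smallest_integer_type` are hypothetical helper names for illustration, not ArcGIS functions.

```python
def signed_range(bits):
    """Return (lowest, highest, count) for a signed integer stored in `bits` bits."""
    lowest = -(2 ** (bits - 1))
    highest = 2 ** (bits - 1) - 1
    return lowest, highest, 2 ** bits

short_min, short_max, short_count = signed_range(16)
long_min, long_max, long_count = signed_range(32)

print(short_min, short_max, short_count)  # -32768 32767 65536
print(long_min, long_max)                 # -2147483648 2147483647

def smallest_integer_type(values):
    """Hypothetical helper: name the smallest integer type that holds every value."""
    lo, hi = min(values), max(values)
    if short_min <= lo and hi <= short_max:
        return "short"
    if long_min <= lo and hi <= long_max:
        return "long"
    raise ValueError("values do not fit in a 32-bit integer")

print(smallest_integer_type(range(0, 91)))   # slope in degrees -> short
print(smallest_integer_type([0, 40000]))     # exceeds 32,767 -> long
```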
Which one to choose comes entirely down to the data you need to store. We don't need to use all the space of a long integer if we're only storing slope in degrees, which would be values from 0 to 90. We can use a short integer for that. A long integer won't harm us, but it will use more disk space.
In addition to our integer numbers, we have the decimal, or floating point, numbers. We once again have two number types here: traditional, or single-precision, floating point numbers and double-precision floating point numbers. Double-precision numbers are exactly what they sound like. They can store larger, more precise decimal numbers than single-precision numbers can, but they use twice the storage space. These two types are often called floats and doubles. Just to help you remember the word float, I'll tell you that the term floating point refers to the fact that the decimal point can metaphorically float to any point it needs to in the number, based on the precision of the information we want to store. It may seem obvious that it can move to whatever point it needs to be in, but it's incredibly important when representing fractional values in binary, as computers do. Floats and doubles are more computationally expensive than integers, so you only use them when you need to. A single-precision float uses the same amount of space as a long integer but can store incredibly large values compared to a long integer, whether or not you're using the decimal place. It can store values between roughly -10 to the 38th power and +10 to the 38th power. A double-precision number is larger still, storing extremely large values between roughly -10 to the 308th power and +10 to the 308th power. These are the data types you want to use if you need to store very large numbers.
Let's do just a really quick review. We can represent small whole numbers as short integers and large whole numbers as long integers.
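To make the float and double discussion above concrete, here is a small Python sketch. It assumes floats and doubles follow the common IEEE 754 32-bit and 64-bit layouts; Python's own `float` is a double, so the hypothetical helper `to_single` uses the standard `struct` module to round a value through single precision.

```python
import struct
import sys

def to_single(value):
    """Round-trip a Python float (a double) through 32-bit single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

single_size = struct.calcsize("<f")   # bytes used by a single-precision float
double_size = struct.calcsize("<d")   # bytes used by a double
print(single_size, double_size)       # 4 8

# 2.340 survives only approximately in single precision.
value = 2.340
as_single = to_single(value)
print(as_single == value)             # False

# The huge range comes at a cost: a float cannot hold every large whole number
# exactly, even though a long integer of the same size can.
big_whole = 16_777_217                # 2**24 + 1
as_single_whole = to_single(float(big_whole))
print(as_single_whole)                # 16777216.0

double_max = sys.float_info.max       # largest double, about 1.8 x 10**308
```

So a float trades exactness for range: useful for very large or fractional values, but not a drop-in replacement for a long integer when every digit matters.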
We can represent real, or decimal, numbers as floating point or double-precision numbers, whether or not we need the fractional part. If we need to store incredibly large numbers, we can also use floating point and double-precision values.
Okay, but what if we need to store some text? We use strings for that. The term string can take some getting used to, but I always like to imagine that in a text string, each letter is individually tied to the next letter by a string. If I were to pick it up with one hand, all of the letters would dangle from it on a string. Different systems have different ways of storing strings and different limitations on them, but in ArcGIS it's commonly just called text. In file geodatabases, you don't really have limitations like you do with numbers, where you need to plan ahead for the type of data you store. In other systems, you sometimes need to specify the length of the text you want to store so that the database can set aside the appropriate amount of space. Now, some of you are thinking, what if I store a number as part of my text? Well, that's okay. The text representation of that number gets stored rather than the numerical representation. It's a version that you're not going to be doing any math on, but it will still appear to the user as the number you want to see, in the midst of your other text. Sometimes numbers are stored as text when they represent categories rather than distinct numerical values.
We have a few other things to cover when we're talking about data types and data tables. The first of these is the concept of the null value. Null values are just blanks. They mean that a value is not defined. It's not the same as zero, though, no matter how much we often want to make it be zero. Null means we don't actually know how much of something there is. We don't have the data, whereas zero means we know how much of something there is, and it is zero.
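Two points from this part of the lesson can be sketched in Python: a number stored as text behaves like text, and a null is not a zero. This is only an illustration with made-up values; Python's `None` stands in for a database null here.

```python
# A number stored as text keeps its characters but loses its arithmetic.
category_code = "02134"            # a code, not a quantity
print(category_code + "1")         # text is joined, not added: 021341
print(int(category_code) + 1)      # as a number, the leading zero is gone: 2135

# Null versus zero: None means "unknown", 0.0 means "known to be zero".
readings = [12.0, None, 0.0, 8.0]

known = [r for r in readings if r is not None]
mean_ignoring_nulls = sum(known) / len(known)      # 20 / 3

as_zero = [0.0 if r is None else r for r in readings]
mean_nulls_as_zero = sum(as_zero) / len(as_zero)   # 20 / 4

print(mean_ignoring_nulls, mean_nulls_as_zero)
```

Treating the missing reading as zero pulls the average down, which is exactly the kind of quiet error that comes from confusing "no data" with "none".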
Null values often appear in data tables that already have records in them when we add a new field but haven't yet populated its values. The database adds the new field with null values, indicating that there is no value for that field yet. Null values can be something to watch out for, though, because they can break things. Doing math on a record with a null value can result in null values rather than zeros, and data cleanup can then be required.
The last thing I want to talk about when we're talking about fields is how to name them. Come up with a consistent scheme that helps you by using predictable names, so that you can easily understand them when you see them. Be somewhat descriptive and, most importantly, be consistent. You also should not use spaces in your names, as most databases don't support them in field names because they can make things ambiguous when running commands. I like using underscores between words myself, but some people like using a method called camel case, where the first letter of every word is capitalized, but there are still no spaces between them. It gets the camel case name because the capitalized first letters make humps in the words, like humps on a camel's back.
Okay, that's it for this lesson. In this lesson, you learned the various field and raster data types and what types of values they can store. You learned how to separate out the different types of numbers and about text strings. You also learned about the concept of the null value, and that nulls can create tricky situations. And lastly, you learned how to name fields. In the next lesson, we'll put some of this into practice and go more in depth on how to use attribute tables in ArcGIS. See you then.
