Exercise - Add attributes

To give you some data to work with, let's load a database of facial images. We'll use the Olivetti Faces dataset, a publicly available dataset that was originally created by AT&T.

  1. Let's create a new file for this class. Open the Cloud Shell editor and create a new file by using the following command:

    code MissingPersons.py
    
  2. One of the many popular packages available for Python is scikit-learn, an open-source library that's used to build machine learning models. scikit-learn includes several built-in datasets, one of which is the Olivetti faces dataset.

    Paste the following statements into your new file to load the faces dataset:

    from sklearn.datasets import fetch_olivetti_faces
    
    # Load the dataset
    faces = fetch_olivetti_faces()
    
    # Prove that the dataset was loaded
    print(faces.data.shape)
    

    The import statement brings in the scikit-learn function that loads the dataset. The call to fetch_olivetti_faces() loads the dataset, downloading it first if it isn't already cached locally. The print statement shows the shape of the data array.

  3. Before we can run the code, we must ensure that the scikit-learn package is installed. The package is imported as sklearn, but it's installed under the name scikit-learn. Run the following command to install it in Cloud Shell:

    pip install scikit-learn
    

    It might take a few minutes to install. Watch the output of Cloud Shell to see when it finishes.

  4. When it finishes, use the following command to run your code (make sure you've saved the new file):

    python3 MissingPersons.py
    

    Examine the output, which should be (400, 4096). The dataset contains 400 faces, each stored as 4,096 pixel values (a flattened 64 x 64 image). There are 10 photos each of 40 different people. The first 10 images in faces.images represent the first person, the next 10 images represent the second person, and so on.

    Curious about what the dataset looks like? Here are the first five people in the Olivetti dataset:

    The first five people in the Olivetti dataset
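The ordering described above (10 consecutive images per person) implies some simple index arithmetic. The following is a minimal sketch and isn't part of the exercise file. The helper names person_of and images_of are hypothetical, and the code assumes the images are in the default, unshuffled order. In the loaded dataset, the faces.target array holds the person number for each image and is the more reliable source for this mapping.

```python
# Sketch only: index arithmetic for the layout described above.
# Assumes 40 people with 10 consecutive images each (400 images in total).
IMAGES_PER_PERSON = 10
PEOPLE = 40


def person_of(image_index):
    """Return the zero-based person number for a zero-based image index."""
    if not 0 <= image_index < IMAGES_PER_PERSON * PEOPLE:
        raise IndexError("image index out of range")
    return image_index // IMAGES_PER_PERSON


def images_of(person):
    """Return the range of image indexes that belong to one person."""
    if not 0 <= person < PEOPLE:
        raise IndexError("person number out of range")
    start = person * IMAGES_PER_PERSON
    return range(start, start + IMAGES_PER_PERSON)


print(person_of(0))        # first image, first person
print(person_of(10))       # first image of the second person
print(list(images_of(1)))  # image indexes 10 through 19
```

With the real dataset, faces.images[i] for each i in images_of(0) would give the 10 photos of the first person.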

Now that we have some faces to work with, let's shift our thinking to objects, classes, and attributes.

Define a class that contains instance attributes

Instance attributes belong to individual objects, so their values can differ from one class instance (object) to another. Unlike class attributes, which you can access through the class itself, instance attributes are available only through an instance. You must create an instance of the class before you can use them.

Python provides multiple ways to create instance attributes, but the most common is to define an __init__() method that assigns the attributes you want objects to have.
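To make the distinction concrete, here is a minimal sketch that is separate from MissingPersons.py. The Sample class and its attribute names are hypothetical. It shows one instance attribute created in __init__(), another created by plain assignment on an existing object, and the AttributeError that results when an instance attribute is requested from the class itself.

```python
class Sample:
    def __init__(self, label):
        # Created on each object when it is constructed
        self.label = label


item = Sample("first")

# Another way to create an instance attribute: assign it on an existing object
item.note = "added later"

print(item.label)
print(item.note)

# The class itself has no 'label' attribute; only instances do
try:
    Sample.label
except AttributeError:
    print("Sample.label is not available without an instance")
```

Assigning attributes in __init__() is usually preferable because every object then starts with the same set of attributes.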

  1. Add the following code to the bottom of your file to define a Person class that contains three instance attributes:

    class Person:
        def __init__(self, name, photo, date_of_birth):
            self.name = name
            self.photo = photo
            self.dob = date_of_birth
    

    The self parameter refers to the object instance, and Python passes it as the first argument to __init__(). (The name self is a strong convention, not a keyword.) Inside __init__(), three instance attributes are assigned that can then be accessed on Person objects:

    • name, which holds the person's name
    • photo, which holds an image of the person's face
    • dob, which holds the person's date of birth

    Three arguments (name, photo, and date_of_birth) must be provided when the object is created; Python supplies the self argument itself. Each argument is assigned to the corresponding instance attribute.

  2. Let's test these attributes. Add the following line of code at the top of the file, right below the existing import statement:

    import datetime
    
  3. Now, add the following code at the bottom of the file to create an instance of Person named aPerson that has the name "Adam" and is assigned the first face in the Olivetti dataset:

    aPerson = Person("Adam", faces.images[0], datetime.datetime(1990, 9, 16))
    
  4. Now, add this statement to display Adam's name:

    print(aPerson.name)
    

    Feel free to remove the previous print statement, print(faces.data.shape), and its associated comment to clean up your code.

  5. Save your file, and then run it in Cloud Shell:

    python3 MissingPersons.py
    

    Your code should output Adam.

Because name, photo, and dob are instance attributes, you could create hundreds of Person objects. Each could hold a different name, photo, and date of birth. If these were class attributes instead, name, photo, and dob would have to be the same for every person—clearly not a model of what happens in the real world.
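The contrast between the two kinds of attributes can be sketched as follows. This is an illustration only and isn't meant to be pasted into MissingPersons.py. The SimplePerson class, its species class attribute, and the second name are hypothetical.

```python
class SimplePerson:
    # Class attribute: one value shared by every instance
    species = "human"

    def __init__(self, name):
        # Instance attribute: each object gets its own value
        self.name = name


adam = SimplePerson("Adam")
eve = SimplePerson("Eve")

print(adam.name, eve.name)        # each object keeps its own name
print(adam.species, eve.species)  # both read the shared class attribute

# Changing the class attribute is visible through every instance
SimplePerson.species = "Homo sapiens"
print(adam.species, eve.species)
```

A shared value such as species suits a class attribute, while anything that varies per person belongs in an instance attribute.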

Data hiding

Many programming languages that support OOP also support data hiding by allowing methods and attributes—"class members"—to be declared private or protected. Private class members can be accessed from inside an object but not from outside the object. Protected class members can be accessed inside an object and inside objects that are subclassed from it (more on this later), but not from outside.

Python doesn't support data hiding, at least not in the same sense that other languages do. Guido van Rossum, the creator of Python, felt that data hiding makes languages harder to use. Consequently, Python has no access modifiers that enforce private or protected class members.

But you can use well-established conventions to let others know that certain class members are for internal use only and should not be accessed from the outside. Prefacing a class member name with an underscore, as in _myProtectedVar, indicates that the class member is protected. Using two underscores (for example, __cleanup()) indicates that the class member is private. Python also rewrites ("mangles") names that begin with two underscores inside a class, which makes accidental outside access less likely but doesn't prevent it.

Although you can still write code to access private and protected methods and attributes from the outside, many Python programming environments honor these conventions and hide private and protected members from view. So, Python does support a limited form of data hiding, but only by convention. Keep these conventions in mind when you share your code with others.
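The underscore conventions can be sketched as follows. The Record class and its attribute names are hypothetical and are used only to show how single-underscore and double-underscore names behave when accessed from outside a class.

```python
class Record:
    def __init__(self):
        self._status = "open"    # single underscore: internal by convention
        self.__token = "secret"  # double underscore: Python mangles the name

    def token_length(self):
        # Inside the class, the double-underscore name works as written
        return len(self.__token)


record = Record()

print(record._status)  # nothing prevents access from the outside
print(record.token_length())

# Outside the class, the plain double-underscore name isn't found...
try:
    record.__token
except AttributeError:
    print("record.__token raised AttributeError")

# ...because it was stored under a mangled name, which is still reachable
print(record._Record__token)
```

As the last line shows, name mangling guards against accidental access and name clashes rather than providing real protection.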