Unmanaged Files in Drupal (Part 1): When and Why to Use Them

Ditch the Database: Smarter File Handling for High-Volume Image Sets

This is the first in a six-part series about using unmanaged files in the Drupal environment to your benefit.

If you've been working with Drupal for a while, you're likely familiar with managed files. If so, feel free to jump ahead to “What is an unmanaged file?”

What Is a Managed File?

Managed files are files that Drupal tracks in its database. They most often reach Drupal through its UI, uploaded during content creation or the submission of a form. They are typically media files, such as documents, spreadsheets, PDFs, or images.

Drupal tracks data about each managed file in the file_managed table, which has the following structure:

Field      Type
fid        int(10) unsigned
uuid       varchar(128)
langcode   varchar(12)
uid        int(10) unsigned
filename   varchar(255)
uri        varchar(255)
filemime   varchar(255)
filesize   bigint(20) unsigned
status     tinyint(4)
created    int(11)
changed    int(11)

These fields provide the metadata you need to select and process files by various criteria, giving you granular control of the file assets. You could, for example, select all files of the same MIME type or language.

You might wonder how this ties into the use of media. For example, what if an image is defined as a media item rather than an image file? An image uploaded as a Media entity is still stored as a managed file. The Media entity is a wrapper that adds reusability and metadata and points back to the underlying managed file.
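
Because managed files live in a database table, selecting them by metadata comes down to a query. As a minimal sketch, assuming Drush 9 or later is installed and the file_managed table has no prefix, the following counts managed files per MIME type. The function name is hypothetical and is not part of Drupal or Drush.

```shell
# Hypothetical helper: count managed files per MIME type.
# Assumes Drush 9+ is on the PATH and the table name has no prefix.
managed_files_by_mime() {
  drush sql:query "SELECT filemime, COUNT(*) AS files FROM file_managed GROUP BY filemime ORDER BY files DESC"
}
```

Run managed_files_by_mime from inside the Drupal project so that Drush can find the site.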

What Is an Unmanaged File?

Simply put, an unmanaged file is a file that is not managed by Drupal, which has no record of its existence. Of course, no file stored on a server is completely unmanaged—the server's file system is aware of it, but for the purposes of a discussion in the context of Drupal, the file is not managed.

About This Tutorial

With all the advantages that come from using a managed file, why would we ever want Drupal to ignore a file? This tutorial will present one answer by way of a use case that lays out why you might want to use unmanaged files in certain circumstances and some ways to do that.

Here is what you can expect to learn throughout this series:

  • Part 1 – the specification for an example use case, an abstracted architecture to fulfill it, and putting some scaffolding in place
  • Part 2 – the file handler that allows the random selection of an “in the wild” unmanaged file
  • Part 3 – a Twig-friendly variable in a block preprocess function
  • Part 4 – a reusable block plugin via a custom module
  • Part 5 – a Twig extension
  • Part 6 – additional file-handling logic to implement the selection rule (no more than one image per region)

The Spec

We will ultimately create a block for a homepage. For this tutorial, the focus is on how that block gets populated. Here are the specifications of the block:

  • Displays outline-map images of three countries
  • The images will be selected randomly
  • The images will be stored in folders by region, continental or oceanic
  • When making the random selection, no more than one image may be from the same region
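
The selection rule is easier to reason about when turned around: choose three distinct regions first, then one image from each. The series builds this logic in a Drupal custom module in later parts. Purely as an illustration, here is a minimal shell sketch. It assumes GNU coreutils (shuf), a find that supports -mindepth and -maxdepth, and one subfolder per region. The function name pick_three_maps is hypothetical.

```shell
# Hypothetical helper: print three image paths, each from a different region.
# Assumes GNU shuf and a base folder containing one subfolder per region.
pick_three_maps() {
  base=$1
  # Shuffle the region folders and keep three, so no region can repeat.
  find "$base" -mindepth 1 -maxdepth 1 -type d | shuf -n 3 |
  while IFS= read -r region; do
    # Pick one random file from the chosen region folder.
    find "$region" -maxdepth 1 -type f | shuf -n 1
  done
}
```

Calling pick_three_maps segregated_maps prints three paths, one per line. If a chosen region folder is empty, fewer paths are printed.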

The Architecture

The detailed architecture will vary somewhat between the three versions of the solution that we will cover. What differs between the three approaches is how the output from the custom module ends up being rendered on the page.

Tutorial Part  Approach             File Handling  Theme Code       Container              Twig?
3              Preprocess variable  Custom module  Preprocess hook  Custom block           Y
4              Block plugin         Custom module  n/a              Block plugin instance  Y
5              Twig extension       Custom module  n/a              Custom block           Y

The Files

The files are outline maps of nations—228 of them. This is a good time to pause and reflect on the premise of this tutorial: sometimes unmanaged files are the best answer. If this is one of those times, then why?

To answer that, let’s consider how one would handle adding 228 images to Drupal, where they will then be managed files. There are some options:

  • Create a content type and use the upload widget on the node form, then set the category using a text list field or a taxonomy term
  • Create a hierarchical taxonomy vocabulary and use the upload widget on the term form and an add-on field to select the category from another vocabulary
  • Create a media item for each image and use a contributed module that enables identifying a category folder for each image
  • Use a contributed module for bulk upload (though it may not accommodate the category requirement)

The common denominator here is effort. Whether you upload the files individually, set their categories individually, or both, a lot of manual work is required. If there is a requirement for the files to be part of node content, or a benefit that justifies the effort, then so be it; go with your judgement. In our case, let’s revisit the information a managed file would provide and weigh its value against our specification:

  • fid – not needed for random selection
  • uuid – unnecessary, filenames are unique
  • langcode – site is not multilingual
  • filename – available even if unmanaged
  • uri – static paths will be used
  • filemime – all are JPEGs
  • filesize – not needed
  • status – all images are permanent
  • created – not relevant
  • changed – not relevant

Clearly, the information that becomes available when a file is managed is unimportant for our purposes. Were any single item important for our needs, the calculus would change. Agreed? Good.

You might be wondering how the requirement can be met when Drupal is working with files it is not aware of. It’s a fair question that will be answered in Part 5 of the tutorial. For now, let’s get the files into place on the server for use later.

Following is the folder tree on my local machine containing all the images:

  • segregated_maps
    • africa
    • antarctica
    • asia
    • australia
    • caribbean
    • central america
    • europe
    • mideast
    • north america
    • pacific islands
    • south america
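
Before transferring anything, it can help to confirm that each region folder holds what you expect and that the total comes to 228. A minimal sketch, assuming one subfolder per region with the images directly inside it. The function name count_maps is hypothetical.

```shell
# Hypothetical helper: print a tab-separated file count per region folder,
# followed by a grand total.
count_maps() {
  base=$1
  total=0
  for region in "$base"/*/; do
    # Skip the unexpanded pattern when the base folder has no subfolders.
    [ -d "$region" ] || continue
    # Count the regular files directly inside this region folder.
    n=$(find "$region" -maxdepth 1 -type f | wc -l | tr -d ' ')
    printf '%s\t%s\n' "$(basename "$region")" "$n"
    total=$((total + n))
  done
  printf 'total\t%s\n' "$total"
}
```

Running count_maps segregated_maps from the parent folder prints one line per region and a final total line.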

As with managed files, there are several options for putting this structure in place on the server. Here are some:

  • Use SSH to log into the server and create the folder structure, then use scp to copy the files
  • Use an application like FileZilla to transfer the folder structure and files over SFTP
  • Use rsync to copy the files, creating the folder structure as it goes

I’m going to use the third option, rsync, and have it compress the files during transfer, which can speed things up. I’ll place the segregated_maps folder under public://, Drupal’s public files directory.

rsync -avz segregated_maps myuser@myserver:/path/to/target/

Breaking it down:

  • -a – archive mode: recurses into folders and preserves permissions, timestamps, and symlinks
  • -v – verbose output
  • -z – compresses during transfer
  • segregated_maps – source folder (no trailing slash preserves folder name)
  • myuser@myserver – SSH target (could be IP or domain)
  • /path/to/target – destination path

The result is that the folder tree and images are now in place at the server path.
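
Because the trailing-slash behavior is easy to get wrong, it can be worth previewing a transfer before running it. rsync's --dry-run flag reports what would be sent without copying anything. A minimal sketch follows. The function names preview_sync and run_sync are hypothetical wrappers, and the source and destination are whatever you pass in.

```shell
# Hypothetical wrappers around the rsync command used above.

# List what would be transferred, without copying anything.
preview_sync() {
  rsync -avz --dry-run "$1" "$2"
}

# Perform the transfer for real.
run_sync() {
  rsync -avz "$1" "$2"
}
```

For example, run preview_sync segregated_maps myuser@myserver:/path/to/target/ first, then run_sync with the same arguments once the listing looks right.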

Next Up

In Part 2, we’ll start coding and build a functional prototype of the unmanaged files handler, which we’ll complete in Part 6.

Note: If you’d like to follow along with actual code, I’ll be publishing the example module and supporting notes in my public GitHub repo, with the link included in Part 2.
