Strictly speaking, “unstructured data” is a contradiction in terms. Data must have structure to be comprehensible. By “unstructured data” people usually mean data with a non-tabular structure.
Tabular data is data that comes in tables. Each row corresponds to a subject, and each column corresponds to a kind of measurement. This is the easiest data to work with.
Non-tabular data could mean anything other than tabular data, but in practice it often means text, or richer media such as images and video. It could mean more constrained data types but with a graph structure rather than a tabular structure. For example, XML and JSON files have a tree structure rather than a table structure.
There’s also overlap between tabular and non-tabular data. For example DICOM images are medical images (essentially JPEG files) with tabular metadata embedded in the file. Coming from the opposite direction, relational databases often contain “blobs” (binary large objects) that are islands of unstructured data contained within structured data.
More productive discussions
My point here isn’t to quibble over language usage but to offer a constructive suggestion: say what structure data has, not what structure it doesn’t have.
Discussions about “unstructured data” are often unproductive because two people can use the term, with two different ideas of what it means, and think they’re in agreement when they’re not. Maybe an executive and a sales rep shake hands on an agreement that isn’t really an agreement.
Eventually there will have to be a discussion of what structure data actually has rather than what structure it lacks, and to what degree that structure is exploitable. Having that discussion sooner rather than later can save a lot of money.
Free text fields
One form of “unstructured” data is free text fields. These fields are not free of structure. They usually contain prose, written in a particular language, or at most in small number of languages. That’s a start. There should be more exploitable structure from context. Is the text a pathology report? A Facebook status? A legal opinion?
Clients will ask how to de-identify free text fields. You can’t. If the text is truly free, it could be anything, by definition. But if there’s some known structure, then maybe there’s some practical way to anonymize the data, especially if there’s some tolerance for error.
For example, a program may search for and mask probable names. Such a program would find “Elizabeth” but might fail to find “the queen.” Since there are only a couple queens , this would be a privacy breech. Such software would also have false positives, such as masking the name of the ocean liner Queen Elizabeth 2. 
 The Wikipedia list of current sovereign monarchs lists only two women, Queen Elizabeth II of the UK and Queen Margrethe II of Denmark.
 The ship, also known as QE2, is Queen Elizabeth 2, while the monarch is Queen Elizabeth II.