File Content/Extension Identification

MarkSanchez (New member · Joined Nov 18, 2020 · Messages: 3 · Programming Experience: 10+)
We are looking to add an extra level of security to the files we receive through our 3rd-party file transfer system. We would like to detect cases where a sender (maliciously or otherwise) uploads an EXE file (a blocked extension) by renaming it with a JPG extension (an allowed extension). Our system does not do this type of validation, so the spoofed file is allowed to upload.

Does anyone have any recommendations for this type of file content validation/identification? We are C# developers, so we would prefer Windows SDKs or libraries that do this.
 
Why are you looking at the file extension? That's only a naming convention. For example, a Screen Saver program is just a .EXE with a .SCR file extension.

The proof of the pudding is in the filling. Look at the actual file contents to determine if it is a .EXE file. All Windows executables have a distinctive header format; otherwise the OS will not execute them.
 
Thank you, but I mentioned file "content" validation/identification, not extension. I only used the extension as an example scenario for what we wish to guard against. We would like to check every file (regardless of its listed extension), inspect its contents (header) to identify it, and validate that it is what it says it is.

Does anyone have any recommendations for this type of file content validation/identification?
 
Yes, read the first 512 bytes (or was it 256 bytes?) and look for the signature at the beginning of the block, as well as some other earmarks.
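
As a rough illustration of that suggestion, the sketch below reads a block from the start of a file and checks whether it begins with the two ASCII bytes "MZ" (0x4D 0x5A), which is how Windows executables start. The names HeaderSniffer, ReadHeader and StartsWithMz are made up for this example, and the 512-byte default simply mirrors the figure mentioned above; it is not a required size.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of any SDK.
public static class HeaderSniffer
{
    // Reads up to `count` bytes from the start of the file.
    // Returns fewer bytes when the file is shorter than `count`.
    public static byte[] ReadHeader(string path, int count = 512)
    {
        using (var stream = File.OpenRead(path))
        {
            var buffer = new byte[count];
            int total = 0;
            while (total < count)
            {
                int read = stream.Read(buffer, total, count - total);
                if (read == 0) break; // end of file reached
                total += read;
            }
            Array.Resize(ref buffer, total);
            return buffer;
        }
    }

    // True when the block starts with "MZ" (0x4D 0x5A).
    public static bool StartsWithMz(byte[] header)
    {
        return header.Length >= 2 && header[0] == 0x4D && header[1] == 0x5A;
    }
}
```

A call such as HeaderSniffer.StartsWithMz(HeaderSniffer.ReadHeader(uploadPath)) would flag a renamed executable no matter what extension it carries. A bare "MZ" match is only a first hint, since two bytes can occur by chance in other data.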

 
Thank you. Our 3rd-party appliance does restrict about 50 file extensions (including those in the link above), but as mentioned, it can be fooled by renaming. If we wrote something ourselves, we could identify files such that if they are any of these excluded types, we could delete the file, send an alert, etc. I will run this type of scenario by my boss.

I believe he was looking for an inclusive method (one that identifies "all" files) rather than an exclusive one (one that only identifies known and potentially dangerous files). Also, we'd prefer something that is ready-made and continually updated. We have identified a couple of such tools but have yet to test them out. We just wanted to see if anyone here has experience doing this type of file identification and what tools (if any) they used.

Thanks
 
Read the Wikipedia article more closely -- it is actually about the file contents. The list of file extensions there was just examples of files that typically have that kind of content.

If you are planning on doing a whitelist/accept list rather than a blacklist/exclude list, then simply assemble your list of acceptable file types and look at the file header signatures of those acceptable files.
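
A minimal sketch of that accept-list idea follows, assuming a small hand-picked table of well-known signatures (JPEG, PNG, GIF, PDF). The AllowList class and ContentMatchesExtension method are hypothetical names, and the table is illustrative rather than complete; a real list would cover whatever types you actually accept.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

// Hypothetical accept list for illustration; extend it with your own types.
public static class AllowList
{
    // Maps an accepted extension to the byte sequences its files may start with.
    private static readonly Dictionary<string, byte[][]> Signatures =
        new Dictionary<string, byte[][]>(StringComparer.OrdinalIgnoreCase)
        {
            [".jpg"] = new[] { new byte[] { 0xFF, 0xD8, 0xFF } },
            [".png"] = new[] { new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A } },
            [".gif"] = new[] { Encoding.ASCII.GetBytes("GIF87a"), Encoding.ASCII.GetBytes("GIF89a") },
            [".pdf"] = new[] { Encoding.ASCII.GetBytes("%PDF-") },
        };

    // True only when the extension is on the list AND the header starts
    // with one of the signatures registered for that extension.
    public static bool ContentMatchesExtension(string fileName, byte[] header)
    {
        string ext = Path.GetExtension(fileName);
        byte[][] candidates;
        if (!Signatures.TryGetValue(ext, out candidates))
            return false; // extension is not on the accept list

        return candidates.Any(sig =>
            header.Length >= sig.Length &&
            header.Take(sig.Length).SequenceEqual(sig));
    }
}
```

With this shape, an EXE renamed to photo.jpg fails the check because its first bytes are "MZ" rather than FF D8 FF. A matching prefix shows the file starts like the claimed type; it does not prove the rest of the file is well-formed.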

Your original question was about how to identify executable files, and that is why I was pointing you towards the PE file format -- that will let you identify executable files by their contents.
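
For the executable case, a slightly stronger check than the leading "MZ" bytes is to follow the PE layout: the 4-byte field at offset 0x3C of the DOS header (e_lfanew) holds the file offset of the PE header, which begins with the signature "PE\0\0". The sketch below assumes a seekable stream; PeDetector and LooksLikePe are hypothetical names, and this is a heuristic, not a full PE parser.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical detector for illustration; checks signatures only.
public static class PeDetector
{
    // Requires a seekable stream. Leaves the stream open for the caller.
    public static bool LooksLikePe(Stream stream)
    {
        if (stream.Length < 0x40) return false; // too short for a DOS header

        using (var reader = new BinaryReader(stream, Encoding.ASCII, leaveOpen: true))
        {
            stream.Position = 0;
            // BinaryReader is little-endian, so bytes 4D 5A read as 0x5A4D ("MZ").
            if (reader.ReadUInt16() != 0x5A4D) return false;

            stream.Position = 0x3C;
            uint peOffset = reader.ReadUInt32(); // e_lfanew
            if (peOffset > stream.Length - 4) return false; // points past the end

            stream.Position = peOffset;
            // Bytes 50 45 00 00 ("PE\0\0") read as 0x00004550.
            return reader.ReadUInt32() == 0x00004550;
        }
    }
}
```

Because the check looks only at content, it treats .EXE, .DLL and .SCR files alike, which is the point made earlier about extensions being just a naming convention.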
 