Today I stumbled upon the paper Rethinking Antivirus: Executable Analysis in the Network Cloud. It describes running lightweight processes on the hosts that ship files to a network server, which scans them and returns a clean/infected verdict. I had the exact same idea around the same time :-). Some benefits of this method would be:
- Performance: while modern (especially multi-core) computers can perfectly well handle desktop AV suites, a multi-engine approach is still a little too heavyweight for them.
- Another performance aspect: since the same file will come from multiple machines, it can be scanned once; assuming it is clean, the other machines don't even need to send the file, just a hash of it (an MD5 hash for example).
- Aggressive caching can be employed both at the client (don't recalculate hashes for files which didn't change on disk) and at the server. Of course the server needs to purge its cache whenever an AV engine is updated, since a file which wasn't detected until now might be detected by the new signatures. The effect of this can also be mitigated: the server should keep a copy of the most frequently submitted files and rescan them preemptively whenever an AV engine is updated (of course the rescanning needs to be done only with that engine). This way these "hot" files can be placed back in the cache.
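To make the caching idea above concrete, here is a minimal sketch of the two caches: a client-side hash cache keyed on file size and mtime, and a server-side verdict cache keyed on the hash. All class and function names are my own illustration, not from the paper, and the MD5 choice just mirrors the example above (a real system should prefer SHA-256).

```python
import hashlib
import os

class HashCache:
    """Client side: avoid rehashing files that didn't change on disk."""
    def __init__(self):
        self._cache = {}  # path -> (size, mtime, digest)

    def digest(self, path):
        st = os.stat(path)
        key = (st.st_size, st.st_mtime)
        cached = self._cache.get(path)
        if cached and cached[:2] == key:
            return cached[2]  # file unchanged: reuse the stored hash
        h = hashlib.md5()  # MD5 as in the example above; SHA-256 would be safer
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 16), b""):
                h.update(chunk)
        d = h.hexdigest()
        self._cache[path] = (st.st_size, st.st_mtime, d)
        return d

class VerdictServer:
    """Server side: scan each unique file once, serve verdicts by hash."""
    def __init__(self, scan_fn):
        self._scan = scan_fn   # placeholder for the multi-engine scanner
        self._verdicts = {}    # digest -> "clean" / "infected"

    def lookup(self, digest):
        # None means "unknown hash, please upload the file"
        return self._verdicts.get(digest)

    def submit(self, digest, data):
        verdict = self._scan(data)
        self._verdicts[digest] = verdict
        return verdict

    def purge(self):
        # Called on engine update; "hot" files could be rescanned
        # preemptively and re-inserted, as suggested above.
        self._verdicts.clear()
```

A client would call `HashCache.digest()` on each file, try `VerdictServer.lookup()` with the hash, and upload the full file via `submit()` only on a cache miss.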
One thing I didn't see in the paper is a discussion of the false positive rate, which also increases when you combine multiple engines like that.
Image taken from Fernando Arconada's photostream with permission.