Thoughts on searching: when to use a hash / when to use a db / when to use a distributed hash?

I have written a few medium-scale (hundreds of gigs of data) database systems and have lately been working on a smaller one (tens of megs of data). In retrospect, I'm wondering if using a database like I did was overkill.  At what point do you say, "This is too much data for a hash / XML help file, I think I will go to a DB"?

There are a number of considerations here. The ones I had were:

  • Where will the data be shared?
  • Where will the data be used?
  • Why are we using this mode?  Performance?  Ease of programmability?  Scalability?
  • How will the data be entered?

All in all, it depends on what you're doing and how much overhead each option adds to the process.
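
To make the trade-off a bit more concrete, here is a rough sketch in Python that loads the same small data set into a plain in-memory hash (a dict) and into an in-memory SQLite table, then runs the same two lookups against each. Everything in it is made up for illustration: the table name `items`, the `user…`/`name…` records, and the helper functions are hypothetical, and this is not a benchmark. The point is only to show how the programming effort compares at this size.

```python
import sqlite3


def build_hash(records):
    """Load (key, value) pairs into a plain in-memory dict."""
    return dict(records)


def build_db(records):
    """Load the same pairs into an in-memory SQLite table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (key TEXT PRIMARY KEY, value TEXT)")
    conn.executemany("INSERT INTO items (key, value) VALUES (?, ?)", records)
    conn.commit()
    return conn


def db_lookup(conn, key):
    """Fetch one value by primary key, or None if the key is absent."""
    row = conn.execute(
        "SELECT value FROM items WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None


# Illustrative data only: 1000 made-up records.
records = [(f"user{i}", f"name{i}") for i in range(1000)]
table = build_hash(records)
conn = build_db(records)

# Exact-key lookup: the hash is a one-liner, the DB needs a query.
hash_hit = table.get("user42")
db_hit = db_lookup(conn, "user42")

# Ad hoc filter on the value: the DB expresses it declaratively,
# while the hash needs a full scan written by hand.
hash_count = sum(1 for v in table.values() if v.startswith("name99"))
db_count = conn.execute(
    "SELECT COUNT(*) FROM items WHERE value LIKE 'name99%'"
).fetchone()[0]
```

For exact-key lookups on data that fits in memory and lives in one process, the hash is hard to beat on simplicity. Once the questions turn into filters, the data has to be shared between processes, or it has to survive a restart, the database starts to earn its overhead.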