Currently the cache status table only marks whether a cache refresh has started. When a refresh gets stuck, there is no option other than to mark the record as either failed or completed and move on. Nothing in the record helps you find the problem in the logs.
Thus add the following three fields:
Node - The Node the cache job is running on
SessionID
RequestID
Optional:
ActualStartTime - Useful for performance tuning, because starttime currently holds the start time of the policy, not the time the refresh of the individual cache table actually began.
The first three fields (Node, SessionID, RequestID) will let people quickly find the problem in the relevant log file.
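To illustrate how the three proposed fields would narrow a log search, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the CacheStatusRecord class, the matching_log_lines helper, and the "session=... request=..." log line layout are hypothetical and do not reflect the actual TDV log format.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class CacheStatusRecord:
    """Hypothetical view of a cache status row with the proposed fields."""
    resource: str
    node: str        # proposed: node the cache job is running on
    session_id: str  # proposed: SessionID
    request_id: str  # proposed: RequestID


def matching_log_lines(record: CacheStatusRecord, lines: Iterable[str]) -> List[str]:
    """Return the log lines that mention both the session and the request.

    Assumes (hypothetically) that each log line carries whitespace-separated
    'session=<id>' and 'request=<id>' tokens. Whole tokens are compared so
    that session 1001 does not also match session 10010.
    """
    wanted = {f"session={record.session_id}", f"request={record.request_id}"}
    return [line for line in lines if wanted <= set(line.split())]
```

With the Node field identifying which server's log to open, a filter like this reduces the search to the lines belonging to the stuck cache job.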
Additionally, TDV could run a background job that checks every few minutes whether the session/request is actually still active and, if it is not, updates the relevant cache_status record and marks it as failed.
This would prevent a cache from appearing to run for many hours (until the admin resets it) while TDV is in fact sitting idle.
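The background check described above could look roughly like the following Python sketch. It is not TDV code: the CacheStatusRecord class, the "RUNNING" and "FAILED" status values, and the is_request_active callback are all hypothetical stand-ins for whatever the server uses internally to track cache status and live sessions/requests.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class CacheStatusRecord:
    """Hypothetical cache status row; field and status names are assumptions."""
    resource: str
    status: str      # assumed values: "RUNNING", "FAILED", "COMPLETED"
    session_id: str
    request_id: str


def mark_orphaned_as_failed(
    records: Iterable[CacheStatusRecord],
    is_request_active: Callable[[str, str], bool],
) -> List[CacheStatusRecord]:
    """Mark running records whose session/request no longer exists as failed.

    is_request_active(session_id, request_id) is a hypothetical lookup that
    reports whether the server still has that request in flight.
    Returns the records that were changed.
    """
    changed = []
    for rec in records:
        if rec.status == "RUNNING" and not is_request_active(rec.session_id, rec.request_id):
            rec.status = "FAILED"
            changed.append(rec)
    return changed
```

A scheduler would call mark_orphaned_as_failed every few minutes. Only records still marked as running are touched, so completed or already failed entries are left alone.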