Odd missing TS packet problem on my HDHR3-CC (seems resolved)

djp952
Posts: 1088
Joined: Wed Oct 01, 2008 8:46 pm
Device ID: 131EB7F7;131ED0E0
Location: Elkridge, MD

Odd missing TS packet problem on my HDHR3-CC (seems resolved)

Post by djp952 » Wed Jan 29, 2020 4:30 pm

pre-edit: Now that I've looked more closely at some other artifacts I have, I think I was wrong and this is specific to the RECORD engine. The statement below about "every hour on the hour" only seems to apply to streams coming from RECORD, either live or recorded. Regardless, I'm leaving this here for feedback :) Please feel free to edit/delete/move it to a better place like the software area if you think it's RECORD as well. Thanks!

I am having a weird problem with a couple channels, most easily observable with TNT-HD. Every hour on the hour, and I mean precisely on the hour, the stream will drop out. The stream isn't corrupted, it's just missing TS packets:

Code:

tsfixcc -n d:\mummyreturns_tnt.mpg
packet index: 14,350,826, PID: 0x177B, missing 1 packets
packet index: 14,350,829, PID: 0x177B, missing 3 packets
packet index: 14,350,830, PID: 0x177B, missing 9 packets
packet index: 14,350,831, PID: 0x177B, missing 5 packets
packet index: 14,350,832, PID: 0x177B, missing 8 packets
packet index: 14,350,833, PID: 0x177B, missing 1 packets
packet index: 14,350,834, PID: 0x177C, missing 7 packets
packet index: 14,350,849, PID: 0x177D, missing 2 packets
packet index: 14,350,853, PID: 0x177F, missing 2 packets
packet index: 14,351,273, PID: 0x177A, missing 6 packets
packet index: 14,351,416, PID: 0x0000, missing 11 packets

The weird part is that it can occur on one tuner but not another on the same device. For example, I've been recording "The Mummy Returns" on TNT via tuner 0 and streamed it manually via cURL from tuner 2 at the same time. The stream from tuner 2 has the packet loss printed above, but the recorded file does not; there are no continuity errors in it at all. However, I've also found the opposite to be true, where the manually streamed data is perfect and the recorded stream is flawed.
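
For readers who want to run a similar check without tsfixcc, here is a minimal Python sketch of how continuity gaps are generally detected in an MPEG transport stream. It is not how tsfixcc itself works (its internals aren't shown in this thread); it simply tracks the 4-bit continuity_counter in each 188-byte packet header, per PID. The function name `find_continuity_gaps` is made up for illustration, and the sketch assumes the file starts on a packet boundary and ignores the discontinuity_indicator flag.

```python
TS_PACKET_SIZE = 188   # fixed MPEG-TS packet length in bytes
SYNC_BYTE = 0x47       # first byte of every TS packet
NULL_PID = 0x1FFF      # stuffing packets; their counter is not meaningful


def find_continuity_gaps(stream):
    """Return a list of (packet_index, pid, missing) continuity gaps.

    `stream` is any binary file-like object positioned at a packet boundary.
    (Hypothetical helper for illustration, not part of any HDHomeRun tool.)
    """
    last_cc = {}   # PID -> last continuity_counter seen on a payload packet
    gaps = []
    index = 0
    while True:
        pkt = stream.read(TS_PACKET_SIZE)
        if len(pkt) < TS_PACKET_SIZE:
            break  # end of file (or a truncated final packet)
        if pkt[0] == SYNC_BYTE:
            pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
            has_payload = bool(pkt[3] & 0x10)
            cc = pkt[3] & 0x0F
            # The counter only advances on packets that carry payload.
            if pid != NULL_PID and has_payload:
                prev = last_cc.get(pid)
                # cc == prev is a permitted duplicate packet, not a gap.
                if prev is not None and cc != prev:
                    expected = (prev + 1) & 0x0F
                    if cc != expected:
                        gaps.append((index, pid, (cc - expected) & 0x0F))
                last_cc[pid] = cc
        index += 1
    return gaps
```

To use it, open a capture in binary mode (`open(path, "rb")`) and pass the file object in. Because the counter is only 4 bits wide, the "missing" figure is modulo 16, so a dropout of 16 or more packets on one PID would be under-reported by a check like this.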

Given the "this can't be random" timing of when this happens, and the fact that I can get different results from different tuners on the same device, I'm reaching out for some help :) Is this perhaps an early sign of a hardware failure or just something that's normal, if out of the ordinary?

I turned on diagnostics on device 131EB7F7 (HDHR3-CC, Firmware 20200121beta1), and it's still recording TNT to the RECORD engine on tuner 0. I have two cURL instances pulling the same channel from tuner 1 and tuner 2, each set to run for an hour. So there should be some data to look at for that device across three tuners for the 7:00PM EST time frame (when the clock gets there, of course; it's only 6:30PM now).

edit: test results

Tuner 0 stream (recorded) is now showing 2 instances of the missing packets; that stream has been running since 5:30PM. I don't believe either instance was present before I enabled diagnostics, but I can't guarantee that.
Tuner 1 stream (live) showed 1 instance of the missing packets, which should be in diagnostics.
Tuner 2 stream (live) is perfectly fine -- again, this is from the same channel that Tuner 0 and Tuner 1 are/were streaming.

Tuner 1 and Tuner 2 streams were started within a couple seconds of each other, but the instance on Tuner 1 is very close to the beginning of the file (around packet 233,000, or around 43MB into the stream), so there is no way that happened at 7:00PM.
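
The packet-index-to-file-position arithmetic behind that estimate can be sketched as follows. TS packets are a fixed 188 bytes, so the byte offset is just the index times 188. Converting that to elapsed time needs the stream's bitrate, which isn't given in this thread; the 15 Mb/s used in the comment below is purely an assumed placeholder, and both function names are made up for illustration.

```python
TS_PACKET_SIZE = 188  # bytes per MPEG-TS packet


def packet_byte_offset(packet_index):
    """Byte offset of a packet from the start of the capture."""
    return packet_index * TS_PACKET_SIZE


def seconds_into_stream(packet_index, bitrate_bps):
    """Rough elapsed time at a given packet, assuming a constant bitrate.

    bitrate_bps is an assumption supplied by the caller, not a measured value.
    """
    return packet_byte_offset(packet_index) * 8 / bitrate_bps


offset = packet_byte_offset(233_000)
print(offset)  # 43804000 bytes, i.e. about 43.8 MB

# With an ASSUMED 15 Mb/s stream this is well under a minute into the capture.
print(seconds_into_stream(233_000, 15_000_000))
```

Whatever the real bitrate is, a glitch tens of megabytes into a capture that started around 6:30PM lands long before 7:00PM, which matches the conclusion above.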

I should be able to set something up to let a couple tuners just run all night with diagnostics on if that helps.

Thanks in advance for any insight!
Last edited by djp952 on Wed Feb 12, 2020 8:06 pm, edited 1 time in total.

nickk
Silicondust
Posts: 15708
Joined: Tue Jan 13, 2004 9:39 am

Re: Odd missing TS packet problem on my HDHR3-CC

Post by nickk » Wed Jan 29, 2020 11:01 pm

The HDHR3-CC is receiving data fine.

Whatever the RECORD engine is running on sometimes has significant delays writing to disk... log shows one glitch where it took 5.5s to write 128k of data to disk.

What hardware are you using for recordings?

djp952

Re: Odd missing TS packet problem on my HDHR3-CC

Post by djp952 » Thu Jan 30, 2020 11:35 am

nickk wrote: Wed Jan 29, 2020 11:01 pm
The HDHR3-CC is receiving data fine.

Whatever the RECORD engine is running on sometimes has significant delays writing to disk... log shows one glitch where it took 5.5s to write 128k of data to disk.

What hardware are you using for recordings?

It's a WD EX4100. I wonder if it's getting CPU bound or something. I've noticed that RECORD sometimes takes a long time to respond to events like "delete".

I'll have a look at it tonight and make sure there isn't something unexpected on there and start poking around on WD's forums for anything similar.

Thanks!!!!

djp952

Re: Odd missing TS packet problem on my HDHR3-CC

Post by djp952 » Thu Jan 30, 2020 8:54 pm

Hi Nick, thanks again for the response and diagnostics as always. I found a possible culprit on the EX4100. I had a MySQL database configured on the device that was acting as a shared media library database for multiple Kodi instances around the house. This was causing an unanticipated spike in CPU on the device, ramping up to nearly 90% CPU at times.

I see no I/O performance concerns with the device, either in actual tests or in some monitoring with iostat. Read and write performance over the network (via SMB) is stellar (>880Mb/s in both directions). All S.M.A.R.T. data looks good, as do all on-device diagnostics. I've disconnected everything from, and removed, the MySQL database and am just waiting for a recording to finish before I bounce the EX4100.

I'll let you know (in the software area) if I continue to see any missing TS packets in the streams originating from RECORD that aren't missing from a stream originating directly from a tuner. Other than the CPU spike I see nothing wrong with the device or the network on this end, and you helped alleviate my concerns about a possible tuner malfunction :)

Side question: Do you know of any relatively easy way to match up disparate TS files by timestamp? If I still see the problem I would love to be able to help out by comparing the "good" stream against the one missing packets to see if the source data might be sending something that is tripping up RECORD. It still concerns me a bit that this only happens on specific channels and at extremely specific times.
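
The side question didn't get an answer in this thread. One possible approach, offered only as a sketch, is to align the two captures on the Program Clock Reference (PCR) values carried in the adaptation field of some TS packets. The assumptions here are that both captures contain the same PCR PID and that the tuner and RECORD pass PCR values through unmodified, in which case the same broadcast moment carries the same PCR in both files. The names `parse_pcr`, `iter_pcrs` and `find_common_start` are made up for illustration.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def parse_pcr(pkt):
    """Return the 27 MHz PCR value carried by one TS packet, or None."""
    if len(pkt) != TS_PACKET_SIZE or pkt[0] != SYNC_BYTE:
        return None
    if not pkt[3] & 0x20:                  # no adaptation field present
        return None
    if pkt[4] < 7 or not pkt[5] & 0x10:    # field too short, or PCR_flag clear
        return None
    # 33-bit base (90 kHz units), 6 reserved bits, 9-bit extension (27 MHz units)
    base = ((pkt[6] << 25) | (pkt[7] << 17) | (pkt[8] << 9)
            | (pkt[9] << 1) | (pkt[10] >> 7))
    ext = ((pkt[10] & 0x01) << 8) | pkt[11]
    return base * 300 + ext


def iter_pcrs(stream):
    """Yield (packet_index, pid, pcr) for every packet that carries a PCR."""
    index = 0
    while True:
        pkt = stream.read(TS_PACKET_SIZE)
        if len(pkt) < TS_PACKET_SIZE:
            break
        pcr = parse_pcr(pkt)
        if pcr is not None:
            pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
            yield index, pid, pcr
        index += 1


def find_common_start(stream_a, stream_b):
    """Return (index_a, index_b) of the first PCR found in both streams, or None."""
    seen = {}
    for idx_a, pid, pcr in iter_pcrs(stream_a):
        seen.setdefault((pid, pcr), idx_a)   # keep the first occurrence
    for idx_b, pid, pcr in iter_pcrs(stream_b):
        if (pid, pcr) in seen:
            return seen[(pid, pcr)], idx_b
    return None
```

Both arguments are binary file objects (`open(path, "rb")`). Once a shared PCR is found, the packets after it in each file can be compared side by side. The PCR counter wraps roughly every 26.5 hours, so a sketch like this is only safe for captures much shorter than that.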

djp952

Re: Odd missing TS packet problem on my HDHR3-CC (seems resolved)

Post by djp952 » Wed Feb 12, 2020 8:36 pm

For the benefit of anyone who has a similar problem and happens upon this thread: it does appear that this was a drive problem in the NAS after all. A full diagnostic of all disks turned up one bad disk (SMART 200 - Multi zone error rate). Replacing it did not solve the problem, but with that out of the way I was able to isolate another disk that was performing poorly. I saved a 24-hour atop report and found that, in aggregate, one disk was showing higher utilization levels and worse I/O times than all the others. After replacing THIS drive, things are finally looking up! The second RAID array rebuild, after getting rid of this drive, also took 4 fewer hours (16 hours down to 12 for ~16TB of data).
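
The "one disk worse than all the others" comparison can be sketched in a few lines once you have a per-disk average I/O time from whatever monitoring tool you use. This is only an illustration: the function name, the 1.5x threshold, and the disk names and numbers in the usage note are all made up, not values from the atop report mentioned above.

```python
from statistics import median


def flag_slow_disks(avg_io_ms, factor=1.5):
    """Return the disks whose average I/O time exceeds factor x the median.

    avg_io_ms maps a disk name to its average I/O time in milliseconds.
    The 1.5x default is an arbitrary illustrative threshold.
    """
    typical = median(avg_io_ms.values())
    return sorted(disk for disk, ms in avg_io_ms.items() if ms > factor * typical)
```

For example, with hypothetical averages of `{"sda": 4.0, "sdb": 4.2, "sdc": 11.5, "sdd": 3.9}` the only disk flagged is `sdc`. Comparing against the median rather than the mean keeps a single bad disk from dragging the baseline up with it.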

Some notes from my research on the drives themselves:
  • The 2014/2015 model Western Digital 6TB Red drives appear to have a relatively high failure rate compared to most other WD Reds (source: backblaze.com research papers)
  • My drives were all closing in on the 40,000 hour mark (~4.5 years of service). If you're getting close you may want to run a full diagnostic; the "quick" diagnostics don't really do anything
  • My drives were running between 36C and 37C. This is high compared to friends with the same disks in non-WD enclosures (they get temps more like 28C-29C) and could be a factor in drive lifetime
  • Believe nickk if he says you have a hardware issue :)

I'm replacing mine with 10TB Seagate IronWolf drives, FWIW. According to backblaze.com's research they are extremely reliable, and they only cost a smidge more than 10TB WD Reds (still a boatload of cash, though). In practice so far, they are notably faster for write operations but a bit slower on read operations. They run cooler than the 6TB Red drives as well: I'm seeing temps of 32-33C as opposed to 36-37C in the same bay(s). Not that it matters, but they are also a thing of beauty to look at, the prettiest hard drives I've ever seen.

So if you are having some EX4100 problems and stumble on this thread, my advice is to run the full disk diagnostic and deal with anything it says you need to deal with, but also be aware that a "good" drive may still be nearing the end of its life and may be the real underlying cause of your performance problems.
