From cfogg@chromatic.com
Newsgroups: sci.engr.television.advanced,alt.video.dvd,rec.video,comp.compression,alt.video.laserdisc
Subject: A Day at the DVD Forum: technical notes.
Organization: Internet
Date: Mon Apr 22 12:43:29 MET DST 1996

Title: DVD Notes from DVD Forum
Draft: 1.0
From:  C. Fogg
Date:  April 11, 1996

These are my notes from attending the DVD Forum held at the Westin
Santa Clara hotel in Santa Clara, California, April 10 and 11, 1996.
Some facts are summarized from the booklet distributed at the meeting.
Additional data has been added from verbal presentations and my own
preconceptions.

HTML version some day, perhaps with illustrations...

WARNING: This document has been slapped together, so kindly forgive
any errors (read: you're lucky you're getting this in the first
place :-)

Number one question: how do I get the DVD spec?
=================================================
At the time of this forum, a new spec was just being drafted. This
should delay any distribution for a few weeks. Currently, it costs
$5,000 USD (U.S. Dollars) to join the DVD Consortium, which has been
the only way to obtain a legal copy of the spec. This policy is
expected to change soon. But note that today it still costs real money
to obtain copies of the old CD books (Blue, White, Green, Red, etc.)

The DVD Consortium also wishes to improve the "dearth" of technical
information on the World Wide Web/Internet.

Forum
=====
The Forum was hosted by Matsushita, Mitsubishi, Philips, Pioneer, Sony,
Thomson, Time Warner, Toshiba, and JVC.

The General Session ran during the morning hours. The
Hardware/Component, Media Software, and Physical breakout sessions ran
as parallel tracks in the afternoon. The general session started with
overviews of the DVD books and ended with a panel of Hollywood home
video studio executives, followed by another panel of game developers.

Most of the audience (~500 people each day) originated from the
hardware sector (semiconductor, board vendors, etc.)
rather than the software sector (game producers, studios,
post-production houses).

==========================================================
BOOKS OVERVIEW
==========================================================
The three applications of DVD are Video, Audio, and ROM. All three disc
types are built upon the same physical specification and file
structure.

Part 3                     Video      Audio
Application                Specs      Specs

Part 2                                           UDF          UDF
File system   UDF-Bridge (M-UDF + ISO 9660)      Format       Format

Part 1        Physical format                    Physical     Physical
Physical      Disc specifications                Format       Format
              (Read-Only Disc)                   Disc specs   Disc specs
                                                 Write-once   Rewritable

Book          Book A     Book B     Book C       Book D       Book E
              Read Only  Video      Audio        Write-Once   Rewritable
              Specs      Specs      Specs        Specs        Specs

Notes:
- The DVD Audio (Book C) is expected in Summer 1996.
- A blue laser version of the physical books (A, D, E) is expected by
  the year 2000.

Structure of Video Book (Book B): Physical part
===============================================
1. General
   - scope
   - general parameters
   - normative reference
   - notations
   - terminologies
   - abbreviations
2. Disc specifications
   - disc outline
   - environmental conditions
   - measuring conditions
   - mechanical parameters
   - optical parameters
   - recorded parameters
   - operation signals
3. Information Area Format
   - track structure
   - sector structure
   - modulation method (8/16 Modulation)
   - Lead-in, Middle and Lead-out Area

Structure of Video Book (Book B): File System Part
===============================================
1. General
   - scope
   - Normative Reference
   - Definitions
   - Notations
   - Data types of descriptor field
2. Volume Structure
   - Requirements for DVD-ROM disc
   - Volume Space
   - Volume Structure of UDF Bridge Format
   - UDF Bridge Volume Recognition Sequence
   - Anchor Point
   - Volume Descriptor Sequence
   - Logical Volume Integrity Sequence
   - CD-ROM Volume Descriptor Set
3. File Structure
   - Requirements for DVD-ROM disc
   - UDF File Structure
   - UDF File Set Descriptor Sequence
   - UDF Directories
   - ICB
   - ISO 9660 Directory Structure and Path Table

Structure of Video Book (Book B): Application Part
===============================================
1. General
   - Scope
   - General Specifications of Presentation Data
   - Normative Reference
2. Technical Elements
   - Definitions
   - Symbols
   - Notations
   - Terminology
   - Abbreviations
3. Introduction
   - Logical Structure of DVD Video
   - Presentation Structure
   - DVD System Model
4. Navigation Data Structure
   - Video Manager Information (VMGI)
   - Video Title Set Information (VTSI)
   - Program Chain Information (PGCI)
   - Presentation Control Information (PCI)
   - Data Search Information (DSI)
   - Navigation Commands and Navigation Parameters
5. Video Object (VOB)
   - Contents of VOBs
   - Pack
   - Player Reference Model
   - Presentation Data

==========================================================
APPLICATION
==========================================================
[Data provided by JVC and Thomson]

DVD Presentation Data (summary)
===============================
Type          Count                  Representation
------------- ---------------------  --------------------------
Video         1 stream only          MPEG-1 or MPEG-2 Video
Audio         maximum of 8 streams   Linear PCM and/or:
                                       Dolby AC-3 (NTSC)
                                       MPEG audio (PAL)
Sub-picture   max 32 streams         Run-length encoded with
                                     bitmap of 2 bits/pixel

(Specific) relation to other standards
======================================
Video    ITU-T H.262 / ISO/IEC 13818-2 (MPEG-2 Video)
         ISO/IEC 11172-2 (MPEG-1 Video)
Audio    ISO/IEC 13818-3 (MPEG-2 Audio)
         ISO/IEC 11172-3 (MPEG-1 Audio)
         Dolby AC-3 standard
System   ITU-T H.222.0 / ISO/IEC 13818-1 (MPEG-2 Systems)
         Program/PES stream only (no Transport streams)

Restriction on transfer rate
============================
max total of combined audio and video:             9.8 Mbit/sec
max sum of Elementary streams + systems overhead: 10.08 Mbit/sec.
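
As a rough illustration of the two limits above, here is a small Python
sketch that checks a proposed stream mix against them. It is only a
sketch under stated assumptions: the notes do not say how sub-picture
streams are counted, so the code assumes they (and the systems
overhead) count only toward the 10.08 Mbit/sec figure. The function
name check_budget and the sub-picture and overhead rates in the example
are hypothetical; the audio rates are the ones used in the Philips
scenarios later in these notes.

```python
# All rates are in kbit/s so that the arithmetic stays exact.
MAX_AV_KBPS = 9_800     # combined audio + video limit quoted in the notes
MAX_MUX_KBPS = 10_080   # elementary streams + systems overhead limit


def check_budget(video_kbps, audio_kbps, subpicture_kbps=(), overhead_kbps=0):
    """Return (ok, av_total, mux_total) for a proposed stream mix.

    Assumption (not stated in the notes): sub-picture streams and systems
    overhead are counted only against the 10.08 Mbit/sec limit.
    """
    av_total = video_kbps + sum(audio_kbps)
    mux_total = av_total + sum(subpicture_kbps) + overhead_kbps
    ok = av_total <= MAX_AV_KBPS and mux_total <= MAX_MUX_KBPS
    return ok, av_total, mux_total


# 8 Mbit/s video, one multichannel track plus three mono dialogue tracks,
# one hypothetical 40 kbit/s sub-picture stream, 100 kbit/s overhead.
print(check_budget(8000, [384, 64, 64, 64], [40], 100))  # (True, 8576, 8716)

# Video alone at 9.8 Mbit/s leaves no room for any audio.
print(check_budget(9800, [384]))                         # (False, 10184, 10184)
```

Working in integer kbit/s avoids floating-point surprises right at the
limits.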

Video Data Specifications
=========================
DVD adds many additional restrictions to the popular compliance
parameter sets of MPEG. One good example is the restriction on the
coded size of a picture: MPEG-2 Main Profile @ Main Level allows any
coded frame size between 16 and 720 pixels horizontally and 16 and 576
pixels vertically. However, DVD restricts the coded size to a very
limited, but practical, subset. In MPEG, audio can be coded at a sample
rate of 32, 44.1 or 48 kHz. In DVD, the rates of both Dolby AC-3 and
MPEG audio are strictly 48 kHz.

MPEG is a generic representation meant for a wide variety of
applications. DVD has taken a practical subset to promote
interoperability by simplifying implementations and ensuring features
(such as random accessibility).

Coded representation:  MPEG-1 (SIF combo)
                       MPEG-2 (Main Profile @ Main Level)
Frame rate:            29.97 or 25 Hz
TV system:             525/60 or 625/50
Aspect ratio:          4:3 (all video formats)
                       16:9 (all formats except 352 pixels/line)
Display Mode:          pan & scan, letterbox
User_data:             closed caption
Coded frame sizes:     525/60: 720x480, 704x480, 352x480, 352x240
                       625/50: 720x576, 704x576, 352x576, 352x288
                       (MPEG-1 is allowed only in 352x240 or 352x288 res).
GOP size:              max 36 fields or 18 frames (NTSC)
                       max 30 fields or 15 frames (PAL)
                       Maximum distance 3 (i.e. IBBPBBPBBP...) between
                       reference frames
Buffer size:           1.835008 Mbits (MPEG-2)
                       max 327680 bits (MPEG-1)
Transfer method:       VBR, CBR (MPEG-2), only CBR for MPEG-1
Maximum bitrate:       9.8 Mbit/sec
Low_delay:             NOT permitted !!!!

Notes [my reflections]:
- the frame rate is the intended display frame rate. The number of
  coded frames in a sequence may vary due to 3:2 pulldown (the DVD MPEG
  decoder performs this function). The permitted values in DVD are more
  restrictive than MPEG-2 MP@ML, which includes 23.976, 24, and 30
  frames/sec rates.
- aspect ratio is the display aspect ratio. Only 16:9 and 4:3 are
  permitted. Note: MP@ML's 2.21:1 is not included.
- MP@ML has no GOP size restriction.
  In fact, the GOP() is considered to be an insignificant layer in
  MPEG-2. Instead the sequence() layer serves as the most important
  boundary.
- The M<=3 (reference frame distance) restriction is additional over
  MP@ML. This distance is arbitrary in the general MPEG syntax and in
  the currently defined Profile and Level combinations.
- The MPEG-2 and MPEG-1 vbv_buffer_size limits are the same as MP@ML
  and Constrained Parameters Bitstreams, respectively.
- The maximum bitrate of 9.8 Mbit/sec is more restrictive than MP@ML's
  15 Mbit/sec limit. However, the point of diminishing returns (no
  visual difference between original video and compressed video) is
  reached by 9 Mbit/sec anyway.
- user_data() fields in MPEG video picture headers contain closed
  captioning (similar to Grand Alliance and DVB methods). See this site
  for more information: http://www.atsc.org/
- For picture sizes, only a very limited set of coded dimensions is
  legal.
- Variable bit rate is permitted only in MPEG-2 streams since the VBV
  model in MPEG-2 has provisions for it.
- contrary to popular belief: all DVD players are required to decode
  video streams up to 9.8 Mbit/sec for indefinite periods of time. The
  popular average rates of 3.5 Mbit/sec or 4.7 Mbit/sec are merely
  canonical figures.

ALL DVD PLAYERS MUST SUSTAIN A 9.8 MBIT/SEC VIDEO DECODE RATE!!!!!!!

MPEG Display Formats
====================
MPEG-2 video decoder chips have implemented pan & scan for a few years
already since it has been a requirement for cable TV and direct
broadcast satellite. The letterbox (vertical filter) requirement is a
relatively new addition. The DVD generation of MPEG-2 video decoders
will probably also perform sub-picture reconstruction.

                        Display Aspect Ratio
                  4:3                  16:9
Source     4:3    No conversion        horizontal filtering
Aspect                                 accomplished by TV monitor.
Ratio
           16:9   letterbox            No conversion
                  (vertical filter)
                  - or -
                  Pan & Scan

Note: Letterbox Conversion is a mandatory feature in the DVD Player !!!
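
The letterbox entry in the table above comes down to simple arithmetic:
a 16:9 picture shown at full width on a 4:3 display occupies (4/3) /
(16/9) = 3/4 of the display height. The Python sketch below works this
out for both TV systems; the function name letterbox is just an
illustrative choice, and the code assumes the bars are split evenly
above and below the picture.

```python
from fractions import Fraction


def letterbox(total_lines, source_ar=Fraction(16, 9), display_ar=Fraction(4, 3)):
    """Return (active_lines, bar_lines) for a wide source on a narrower display.

    Assumes the black bars are split evenly between top and bottom.
    """
    active = total_lines * display_ar / source_ar
    if active.denominator != 1:
        raise ValueError("active height is not a whole number of lines")
    active = int(active)
    bar, remainder = divmod(total_lines - active, 2)
    if remainder:
        raise ValueError("bars cannot be split evenly")
    return active, bar


print(letterbox(480))  # 525/60: (360, 60)
print(letterbox(576))  # 625/50: (432, 72)
```

These are the same 360/60 and 432/72 line counts given under Picture
Size Conversion later in these notes.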

Subpictures
===============================
- run-length compressed bitmaps that are overlaid on top of the MPEG
  reconstructed video.
- Applications include: menus, sub-titles, karaoke, and simple
  animation.
- Pixels are divided into four types:
    1. Background
    2. Foreground
    3. Emphasis-1
    4. Emphasis-2
- 4 colors out of 16 color palette (4 colors are determined once per
  PGC).
- 4 out of 16 contrast values
- up to a maximum of 32 sub-picture bitstreams. Each subpicture stream
  could, for example, contain text from a particular language.
- subpicture buffer size is restricted to 62 Kbytes. This means a
  maximum of 62 KB per GOP/cell. 32 Kbytes of this is control data.
- Maximum number of bits per run-length coded line is 1440 bits.
- Display area maximum: 720x480 (525/60) and 720x576 (625/50)
- area, content, color, and contrast can be changed every video field
- Sub-Picture Display Control Sequences (SP_DCSQ) control the
  presentation of Sub-pictures.
- Presentation effects include: scroll up/down, fade in/out, etc.

Structure of Sub-picture Decoding Unit (SPU):

  [ SPUH ][ PXD ][ DCSQT ]

  SPUH:   Sub-picture Unit Header (size of SPU, start address of DCSQT)
  PXD:    Pixel Data (variable length run-length coded)
  DCSQT:  Display Control Sequence Table (one or more display control
          command sequences).

  DCSQT:  [DCSQ 0][DCSQ 1][DCSQ 2] ... [DCSQ n]
  DCSQ:   [Start time] [Pointer to next DCSQ] [Command Sequence]
  Command Sequence: [DCC 0][DCC 1]... [DCC m]

  Display Control Commands (DCC):
  - Set start address in PXD
  - Set colors
  - Set contrast
  - Set SP screen position
  - Start/stop display
  - Set CHG_COLCON areas.

VBI Decoding
============
The Vertical Blanking Interval (VBI) packet (multiplexed at the Cell
level along with Navigation, Video, and Audio packets) contains
information which is directly inserted into the reconstructed video
signal, sans level adjustments (16 levels into, e.g., a 256-level video
signal).
- only 1 VBI channel per program (sub-pictures have up to 32)
- Line range is from 10 to 23 for NTSC and 6 to 23.5 for PAL.
- Separate palette (16 Y values, Cr=Cb=128) from subpictures.
- No highlight
- Restricted DCSQ command set

VBI information is losslessly represented as a waveform, and coded into
packets. A far more bandwidth efficient alternative is to transmit the
source character stream in the MPEG video user_data() field, and then
have the NTSC/PAL modulator chip create the VBI signal from the
character stream.

This brings our tally of closed caption representations to THREE ways!!
  1. as packets of 16-level sampled VBI waveforms.
  2. as user_data() character streams.
  3. as rendered subpictures.

Picture Size Conversion
=======================
All DVD players are required to have built-in vertical filters which
scale a 16:9 coded video image onto a 4:3 display. This player feature
is needed since it is anticipated that a majority of movies will be
coded for the 16:9 aspect ratio, while at the same time most TV
displays (in the early years) will be 4:3.

525/60 (NTSC-rate display):
(Note: 480*(4/3)/(16/9) = 480*0.75 = 360)

   _____________________
  |         60          |
  |---------------------|
  |                     |
  |        360          |   480 lines total
  |                     |
  |---------------------|
  |         60          |
   ---------------------

625/50 (PAL-rate display):

   _____________________
  |         72          |
  |---------------------|
  |                     |
  |        432          |   576 lines total
  |                     |
  |---------------------|
  |         72          |
   ---------------------

A simple bi-linear vertical filter can be applied, yielding good visual
results. Here, two source samples (s[n],s[n+1]) are weighted by simple
complementary factors and added together to form the destination sample
value (d[m]). These weights are easily implemented with shifters. For
interlaced displays, vertical filtering occurs only within the same
field parity.
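
A minimal Python sketch of such a 4-line to 3-line filter follows,
using the three weight pairs these notes give for d[0], d[1] and d[2].
Two things in it are my assumptions rather than anything stated at the
forum: the same three weight pairs are simply repeated for every group
of four source lines, and results are rounded to nearest by adding half
the divisor before shifting. The function name letterbox_filter is
hypothetical. The input is one column of samples taken from a single
field, in keeping with the same-parity rule.

```python
def letterbox_filter(src):
    """Scale one column of same-field samples by 3/4 vertically.

    Every group of four source samples yields three output samples using
    the weights (3/4, 1/4), (1/2, 1/2) and (1/4, 3/4). Multiplications by
    3 and the divisions are shift/add friendly; the "+ 2" and "+ 1" terms
    implement round-to-nearest (an assumption, not from the notes).
    """
    if len(src) % 4:
        raise ValueError("source length must be a multiple of 4")
    out = []
    for i in range(0, len(src), 4):
        s0, s1, s2, s3 = src[i:i + 4]
        out.append((3 * s0 + s1 + 2) >> 2)   # d[0] = 3/4*s[0] + 1/4*s[1]
        out.append((s1 + s2 + 1) >> 1)       # d[1] = 1/2*s[1] + 1/2*s[2]
        out.append((s2 + 3 * s3 + 2) >> 2)   # d[2] = 1/4*s[2] + 3/4*s[3]
    return out


print(letterbox_filter([0, 40, 80, 120]))   # [10, 60, 110]
print(len(letterbox_filter([128] * 480)))   # 360
```

A linear ramp in the source stays a linear ramp in the output, which is
a quick sanity check that the weights are placed correctly.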

  d[0] = (3/4)*s[0] + (1/4)*s[1]
  d[1] = (1/2)*s[1] + (1/2)*s[2]
  d[2] = (1/4)*s[2] + (3/4)*s[3]

Audio data specifications
=========================
                           Linear PCM       Dolby AC-3      MPEG-2 audio
Sampling frequency         48 or 96 kHz     48 kHz          48 kHz
Number of bits per sample  16/20/24         compressed      compressed
                                            (16 bits)       (16 bits)
Max transfer rate          6.144 Mbit/sec   448 kbits/sec   640 kbits/sec
Max Number of channels     8                5.1             5.1 or 7.1

            NTSC              PAL
Mandatory   Dolby AC-3        MPEG-2 audio
            and/or            and/or
            Linear PCM        Linear PCM
Optional    MPEG-2 Audio      Dolby AC-3

=============================================================
Philips provided three practical scenarios for audio.

Case 1: One mono language channel to be mixed with the Center
        multichannel set.

  Use                                  Channels      kbits/sec
  Multichannel music & effects         5.1 or 7.1    384
  Mono English dialogue                1             64
  Mono French dialogue                 1             64
  Mono German dialogue                 1             64

Case 2: One of the stereo lingual signals mixed with the L & R channel
        of the playback multichannel set.

  Multichannel music & effects         5.1 or 7.1    384
  Stereo English dialogue              2             128
  Stereo French dialogue               2             128
  Stereo German dialogue               2             128

Case 3: One to be selected for playback.

  Multichannel with English dialogue   5.1 or 7.1    384
  Multichannel with French dialogue    5.1 or 7.1    384
  Multichannel with German dialogue    5.1 or 7.1    384

Audio Signal Decoding System
=============================

General
=======
- up to a maximum of 8 audio streams can be multiplexed into the same
  cell with a single video stream. Each stream, for example, is
  designated for a particular language or for special effects & music
  tracks.
- Dolby AC-3 is mandatory for 525/60 (NTSC) players and MPEG-2 audio is
  mandatory for 625/50 (PAL) players, but both are optional on discs
  themselves.
- LPCM (Linear Pulse Code Modulated) is mandatory for all players, but
  optional on discs themselves.
- 48 kHz and 96 kHz uncompressed PCM audio ("High Definition Audio
  Experience")
- A 525/60 disc must contain either Dolby AC-3 or LPCM. A 625/50 disc
  must contain either MPEG-2 audio or LPCM.
  Due to bandwidth efficiency, most titles will use the more compact
  Dolby AC-3 or MPEG-2 audio.
- Extendibility is reserved for new algorithms such as DTS, Sony SDDS,
  et al.
- IEC-958 Digital Audio Interface for external decoder/receiver.
  Output types: compressed AC-3 or MPEG stream, two channel LPCM.

DVD players are required only to output a full reconstruction of the
Left and Right channels. An external AC-3 decoder would optionally
decode all 5.1 channels. A more expensive DVD player would output all
5.1 reconstructed channels.

Dolby AC-3 parameters
====================
Sampling frequency:  48 kHz
bitrate:             64 kbits/sec to 448 kbits/sec per stream
Audio coding mode    1/0, 2/0, 3/0, 2/1, 2/2, 3/1, and 3/2
(acmod)
Characteristics:
  - dialog normalization
  - dynamic range compression
  - downmixing (5.1 -> 2 channel) capability
  - Dolby Pro-Logic Encoding (5.1 -> 2 channel)
  - Karaoke mode (voice overlay)

MPEG Audio parameters
=====================
Sampling frequency:  48 kHz only
MPEG-1:              Layer II only
                     Mono (32 to 192 kb/s) and Stereo (64 to 384 kb/s)
MPEG-2:
  - main stream (same as MPEG-1)
  - extension stream (up to 528 kbit/sec)
  - sum of main and extension stream up to 912 kb/s
  - unmatrix mode excluded (always MPEG-1 compatible)

LPCM Coding
===========
- Lossless/uncompressed PCM audio
- Sampling frequency: either 48 kHz or 96 kHz
- bits/sample: 16, 20, or 24 bits
- up to 8 PCM channels.

Due to the user rate bandwidth limitation of 6.144 Mbit/sec for LPCM
audio, not all combinations of channel count, sample precision and
sample rates are permitted:

  Sample   Sample    Channel Count
  Rate     Prec.     Mono   2 CH   5 CH   8 CH
  48 kHz   16        Yes    Yes    Yes    Yes
           20        Yes    Yes    Yes    No
           24        Yes    Yes    Yes    No
  96 kHz   16        Yes    Yes    No     No
           20        Yes    Yes    No     No
           24        Yes    Yes    No     No

===============================================================
[data from Mitsubishi]

DVD Feature Functions
=====================
1. Multiple titles on one disc
2. Seamless playback transitions
   2.1 multiple versions of language credits
   2.2 director's cut (Parental lock)
   2.3 multiple versions based on camera angles
3. Multi-Language System (audio, closed caption, et al)
4. Navigation System
5. Multi Screen Aspect Ratio (16:9, 4:3, letterboxed, pan&scan)
6. Multi Sound system (5.1 or 7.1 channels)

maximum program_mux_rate: 10.08 Mbit/sec

Audio streams are multiplexed with video. So, alternative audio tracks
beyond the limitation of 8 must be multiplexed with a different video
stream.

Source: Pioneer
Title: Interactive functions
==============================
Basic User interface:
  - Control: ten keys and cursor keys
  - Display: menu graphics and high-light

GUI Display:
  - Menu picture with subpicture and MPEG graphics
  - highlighted area

Menu:
  Basic
    1. Title A
    2. Title B
    3. Title C
    4. Previous
    5. Next

  Multi-page Menu
    1. Title A        4. Title D          7. Title G
    2. Title B        5. Title E          8. Title H
    3. Title C        6. Title F          9. Title I
    Exit Next         Prev Exit Next      Prev Exit

Interactivity
=============
Level of functionality
  1. simply play
  2. interactivity similar to Video-CD
  3. interactivity similar to PC Applications

Functions
=========
Information Control
  - parental control
  - copy management
Menu
  - Title: sub-picture
  - Root: Angle
  - Audio: part of title
Search functions:
  - program search
  - time search
  - angle search
  - part of title search
Seamless play function
Still picture function

File Structure Hierarchy
=========================
The DVD is broken into two separate types of information: Navigation
Data (control) and Presentation (object) data. Control data acts as
pointers (like an operating system's File Allocation Table) to the
actual video and audio object data on the disc.

Control data can be expressed as a series of nested layers:

Title           distinguishes multiple movies or TV episodes on one
                disc. Each title is one of two types: a single program
                chain (One_Sequential_PGC_Title) or a collection of
                different program chains (Multi_PGC_Title).
Program Chain   A collection of programs with a particular theme in
                common.
Part_of_Title   Links to one or more Program (PG) units on the disc.
                Like PGC, this mechanism can be used to create
                different versions (camera angle, ratings, outcomes,
                etc.) of the same program chain. PTTs can also be used
                to mark scenes.
Program         Usually a scene. Consists of multiple cells.
Cell            Preceded by a navigation packet, followed by
                alternating video and audio packets. A cell is
                typically all the video and audio data associated with
                an integer number of groups of pictures.
VOBU            Video Object Unit: "typically" a group of pictures
                (GOP)
GOP             1. smallest granularity of random access on disc
                   (a group of pictures begins with a coded Intra
                   frame)
                2. largest interframe dependent coding unit.
                   (Interframe compression is bounded within a GOP)
                Usually 15 coded frames of data (0.5 seconds display
                duration).
Packet          DVD packets are 2048 bytes (sector payload size) large.
                As per MPEG-2 PES/Program streams, they contain data
                from only one data type (video, audio, etc.)
                The NAV packet contains the optional Button-Command
                defining the playback behaviour of the cell.

1. Logical structure of Video Manager and Video Title Set
[notes from Hitachi]
=========================================================
A DVD may contain up to 99 different titles, each with an initial
Navigation Menu allowing the user to select among different versions of
the title. The root menu which branches to all titles on the disc
originates with the Video Manager. Each title is organized as a Video
Title Set (VTS).

  DVD: [VM][VTS #1][VTS #2] ..... [VTS #n]   where n<=99

The VM's VMGI includes: Attributes for the Menu, Title Search Pointers,
and the PGCI for the Menu.

  VM: [VMGI][VOBS for Menu][Back up for VMGI]

The Control Data (VTSI) for the title (VTS) includes: Attributes for
Menu, Attributes for Title, Part of Title Search Pointer, Time Map
Table, PGCI for Menu, and PGCI for Title.
The Video Objects (VOBS) contain the actual program chains,
Part_of_Titles, programs, and so forth.

  VTS: [VTSI][VOBS for Menu][VOBS for Title][Back up for VTSI]

Legend:
  VM    Video Manager: sets up menus for a series of titles
        (1 through n)
  VTS   Video Title Set: a collection of video objects.
  VMGI  Video Manager Information
  VOBS  Video Object Set
  PGCI  Program Chain Information

Structure of Title
==================
A title begins with the entry program chain (Entry PGC). It can branch
to a single program chain (One_Sequential_PGC_Title) or multiple
program chains (Multi_PGC_Title). The location of the branch is
determined by the link condition.

Structure of a Program Chain (PGC)
==================================
The program chain is broken into two separate entities:
  - program chain information (PGCI)
  - video object (VOB)

The PGCI defines the playback order of Programs by acting as a table of
addresses which point to the sector locations of the program cells on
the DVD. A program cell is essentially a group of pictures (GOP),
spanning multiple sectors, and contains the actual interleaved packets
of compressed bits for video and audio data.

Part_of_Title (PTT)
===================
The Part_of_Title divides a title into a maximum of 99 different
pieces. The intent of the PTT is to aid in the construction of multiple
versions of the same title.

One_Sequential_PGC_Title: The Part_of_Title and Program numbers are
synchronized.

  [  PTT #1  |  PTT #2  | .... |  PTT #n  ]   Part_of_Title
  [ [PG #1]  | [PG #2]  | .... | [PG #n]  ]   Program Chain (PGC)

Multi_PGC_Title:

              branch
                        PTT #2
                   -->  [PG #1]                        (PGC1)
                        PTT #3   PTT #m
  PTT #1           -->  [PG #1] [PG #j] ... [PG #k]    (PGC2)
  [PG #1]          -->  [PG #1]                        (PGC3)

Presentation of PGC
===================
The program chain (PGC) can be presented either serially (linear) or in
random/shuffle (non-linear) fashion.

For example, a quiz title should break each question into separate
programs.
The next program chain branched to would be determined by the answer
provided by the user.

Still
=====
Still pictures are coded as MPEG intra frames. They may be displayed
for indefinite duration. They can be accompanied by background music,
or total audio mute.
- still function is created by the action of the navigation system
- The same video frame and sub-picture is frozen (displayed over and
  over again on the TV) while audio is muted or playing in the
  background.

There are three types of the Still Function:

  Type        Timing                            Still time in seconds
  PGC Still   Stills at end of the PGC          0-254, limitless
  Cell Still  Stills at end of the Cell         0-254, limitless
  VOBU Still  Stills in every VOBU in the Cell  limitless

  VOBU: Video Object Unit.

Search Functions by User
========================
There are 6 search functions defined for DVD. Two are present in most
of today's VCRs: the linear style Time Search and Scan (Fast forward,
rewind). The other 4 are made possible thanks to the non-linear,
random-access playback capability of DVD.

User operation (ability to scan through or play) can be prohibited by
content, identified by such attributes as the parental control level.
For example, certain Part_of_Titles which contain R-rated (US) scenes
can be skipped over.

Title Search:          User can select the exact title to shuttle to.
Part_of_Title Search:  User can go to specific version (PG-13, R,
                       director's cut, children's version) or camera
                       angle by either title name or number.
Program Search:        User can go to a specific scene (car chase,
                       opening credits, gun fight, etc.) within a
                       program chain.
Time Search:           User can go to a specific SMPTE style time code
                       (HH:MM:SS:FF) location within a program chain.
Scan:                  Scan (linearly) forward or backwards in time.
GoUp:                  Within the current program chain, jump to the
                       next program chain. This command traverses the
                       DVD control information hierarchy.

For Time Searches, all DVD players are required to arrive at the
nearest I picture.
It is optional that DVD players be capable of arriving at the exact
picture (regardless of its picture coding type).

Navigation Commands and Parameters
==================================
The author (content provider) is given the freedom of creating an
arbitrary branching structure for a given title. Of course some
restraint should be exercised since, thanks to interframe MPEG coding
dependencies and physical servo mechanism limitations, a program chain
cannot be constructed of 30 pictures/sec of totally randomly located
information on the disc. However, the constant DVD transfer rate of
11 Mbit/sec provides some flexibility when the average program rate is
kept lower. For example, if the average bit rate is only 5 Mbit/sec,
then the player can waste 6 Mbit/sec of potential transfer rate in
random access overhead.

Player Settings:
There are 24 system parameters for player setting:

  SPRM  Meaning
  ----  -----------------------------------------------------------------
   0    Menu Description Language Code
   1    Audio stream number
   2    Sub-picture Stream number
   3    Angle Number
   4    Title Number
   5    VTS title Number
   6    Title PGC Number
   7    Part of title number for One_Sequential_PGC_Title
   8    Highlighted Button number
   9    Navigation Timer
  10    Title PGC number for Navigation Timer
  11    Audio Mixing Mode for Karaoke
  12    Country Code for Parental Management
  13    Parental Level
  14    Player Configuration for Video
  15    Player Configuration for Audio
  16    Initial Language Code for Audio
  17    Initial Language Code for Sub-picture
  18    Initial Language Code Extension for Sub-picture
  19    Initial Language Code for Sub-picture
  20    Reserved
  21    Reserved
  22    Reserved
  23    Reserved

General Parameters:
Used for interactive operation of titles, such as quizzes, or games.
- 16 general parameters for navigation. These are RAM variables in the
  DVD players for use as, e.g., arithmetic scratch pads, counters, etc.
- Arithmetical operations are available (add, compare, etc.)
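
To make the scratch-pad idea concrete, here is a toy Python model of
the general parameters. It is only a sketch: the class and method names
are hypothetical, and the 16-bit unsigned register width with
wrap-around is my assumption, since the forum material summarized here
gives no width. Only the count of 16 registers and the existence of
add and compare operations come from the notes.

```python
class GeneralParameters:
    """Toy model of the 16 general parameters used by navigation commands."""

    COUNT = 16        # number of general parameters, per the notes
    MASK = 0xFFFF     # assumption: 16-bit unsigned registers (width not given)

    def __init__(self):
        self.regs = [0] * self.COUNT

    def _check(self, index):
        if not 0 <= index < self.COUNT:
            raise IndexError("general parameter index out of range")

    def set(self, index, value):
        """Store a value, truncated to the assumed register width."""
        self._check(index)
        self.regs[index] = value & self.MASK

    def add(self, index, value):
        """Add to a register, wrapping at the assumed register width."""
        self._check(index)
        self.regs[index] = (self.regs[index] + value) & self.MASK

    def compare(self, index, value):
        """Return -1, 0 or 1 if the register is below, equal to or above value."""
        self._check(index)
        reg = self.regs[index]
        return (reg > value) - (reg < value)


# Tally a quiz score in parameter 0: two correct answers out of three.
gprm = GeneralParameters()
gprm.set(0, 0)
gprm.add(0, 1)
gprm.add(0, 1)
print(gprm.compare(0, 2))  # 0  (score equals 2)
print(gprm.compare(0, 3))  # -1 (score is below 3)
```

A title author would use such a comparison result to decide which
program chain to branch to next.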

Navigation Commands
===================
- Each command consists of a single instruction or a combination of two
  or three instructions.

Instruction Groups:
  Goto       branch between commands
  Link       transfer within the same Domain
  Jump       transfer between Domains
  Compare    recognition of parameter value
  SetSystem  player system setting
  Set        calculate GPRM values

Location of each command
========================
Within a program chain (PGC), commands can be located at the front of
the chain, in between cells of the chain, and at the end of the chain.

  Program chain
  [Pre-Commands] [Cell] [Cell] [Cell-Command] [Cell] [Post-Commands]

Each cell can have one command. There is a restriction that no more
than 128 commands can be contained within a program chain:

  Pre-commands + Cell Commands + Post Commands <= 128

Further, there are a maximum of 36 buttons, each of which can have one
associated command.

Example of a PGC transition
===========================
[taken from the Hitachi overheads]
3 quiz problems are presented to the user. Each quiz problem/question
is coded as a separate program chain. One of the questions prompts the
user for a "Yes" or "No" answer. The Link command is used to branch
from the original top-level menu to one of the three program chains.
The Set command is used to tally a score. Finally, the CompareLink
command (which consists of two instructions, Compare & Link) branches
to a particular Program depending on the user's answer.

==========================================================
FILE SYSTEM
==========================================================

Directory Structure
===================
File directory is based on ISO 9660 and the micro Universal Disk Format
Specification (M-UDF).
The latest UDF specification (November 3, 1995) can be obtained from:

  Optical Storage Technology Association
  311 East Carrillo Street
  Santa Barbara, CA 93101 USA
  Voice:  +1 805 963 3853
  Fax:    +1 805 962 1541
  E-mail: osta@aol.com

                         Root
  ---------------------------------------------------
  |                      |                          |
Video_TS              Audio_TS              Provider defined
  |
  |
  | - Video_TS.INF (Video Manager Information)
  | - Video_TS.VOB (Video Manager Menu)
  | - Video_TS.BUP (Video Manager Information)
  |
  | - TITLE_A.INF  (Video Title Set Information)
  | - TITLE_A0.INF (Video Title Set Menu)
  | - TITLE_A1.VOB (Video Title Set Title)
  | - TITLE_A2.VOB (Video Title Set Title)
  | - TITLE_A.BUP  (Video Title Set Information)

Layout of Volume
================
  Lead-in | Data Recorded Area               | Lead Out
          | ISO9660&M-UDF | File 0 | File 1  |

Disc Type and Capacity
======================
(capacities in GBytes)
         Single layer   Dual layer     Single layer   Dual Layer
         Single sided   Single sided   Double sided   Double sided
12 cm    4.7            8.5            9.4            17
8 cm     1.4            2.6            2.9            5.3

Recordable time on a disc scenarios:
====================================
(average rate in Mbit/sec)
                          Avg.    Minutes
                          rate    SL/SS   DL/SS   SL/DS   DL/DS
Movie                     4.8     130     236     259     472
  Video 3.5
  Audio (AC-3 3 lang)
  Sub-picture 4
Karaoke                   4.0     155     282     310     564
  Video 3.5
  Audio (AC-3 1 lang)
  Sub-picture 1
Video Clip A              5.2     120     218     340     436
  Video 3.5
  Audio (2 ch. PCM)
  Sub-picture
Video Clip B              8.8     71      129     142     258
  Video 7.0
  Audio (2 ch. PCM)
  Sub-picture
Video Clip C              8.4     75      136     149     272
  Video 3.5
  Audio
  Sub-picture

==========================================================
PHYSICAL
==========================================================

Physical specifications:
===============================================
Toshiba provided the following table:

                          12 cm disc     8 cm disc
User Data Capacity
  Single Layer            4.7 GByte      1.4 GByte
  Dual Layer              8.5 GByte      2.6 GByte
Pit Length (minimum)      0.4 microns
Track pitch               0.74 microns
recording modulation      8/16
sector size               2048 bytes
error correction code     Reed-Solomon product code:
                          RS(208,192,17) x RS(182,172,11)
ECC Constraint Length     16 Sectors (=32 Kbytes)

Further physical specs
======================
Spiral direction          clockwise

Comparing DVD and CD [Mitsubishi data]
=================================================
                          Units          DVD            CD
Outer diameter            millimeters    120            120
Thickness of substrate    millimeters    0.6            1.2
track pitch               microns        0.74           1.6
min. pit length           microns        0.40 SL        0.834 - 0.97
                                         0.44 DL
wavelength                nanometers     650            780
Numerical Aperture of     N/A            0.60           0.45
Objective Lens
Error correction          N/A            RS Product     RS 8-bit code
                                         Code
Error correction          percentage     13             25
overhead
Data capacity             Gigabytes      4.7, 8.5,      0.65 (CD-ROM)
                                         9.4, or 17     0.80 (CD Music)
Channel modulation        N/A            8/16           8/17
Data bit rate (1X)        Mbit/sec       11.08          1.44
Reference scanning        meters/sec     3.49 SL,       1.2 to 1.4
velocity                                 3.84 DL
Reflectivity              percentage     70 min SL      70 min
                                         25 to 40 DL
Thickness of spacing      microns        40 - 70        N/A
Layer in Dual Layer
Spot Size                 lambda/NA      0.63           1
Focus Depth               lambda/NA^2    0.47           1
(Focus Margin)
Comatic Aberration        lambda/NA^3    0.35           1
(Tilt margin)
Spherical Aberration      lambda/NA^4    0.26           1
(Thickness Tolerance)

Note: the minimum pit length for Dual layer is 10% greater, hence the
10% less dense figure for Dual layer discs.

more Toshiba data...
Disc specifications: =============================================== 8 cm 12 cm outer diameter: 80 mm 120 mm outer data diameter: 76 mm 116 mm inner data diameter: 48 mm 48 mm Track pitch: 0.74 microns (same as Toshiba original proposal) Pit length: Min: 0.4 micron (same as Toshiba original proposal) Max: 2.13 micron to 1.87 micron Scanning velocity: 3.49 m/sec Channel bitrate: 26.16 Mbit/sec User data bit rate: 11.08 Mbit/sec Recording order on the disc (Track Structure) =============================================== Legend: I Lead-in area (leader space near edge of disc) D Data area (contains actual data) O Lead-out area (leader space near edge of disc) X un-usable area (edge or donut hole) M Middle area (interlayer lead-in/out) B Dummy bonded layer (to make disc 1.2 mm thick instead of 0.6mm) Single layer disc: direction: continuous spiral from inside to outside of disc. | -----------------------> |BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB outer edge |XXIIIDDDDDDDDDDDDDDDDDDDDDOOOXX of disc | reference axis Dual layer disc: (A) Parallel track path (for computer CD-ROM use) Direction: same for both layers. -----------------------> XXIIIDDDDDDDDDDDDDDDDDDDDDOOOXX Layer 1 XXIIIDDDDDDDDDDDDDDDDDDDDDOOOXX Layer 0 -----------------------> (B) Opposite track path (for movies) Direction: opposite directions Since the reference beam and angular velocities are the same at the layer transition point, the delay comes from refocusing. This permits seamless transition for movie playback. <---------------------- XXOOOODDDDDDDDDDDDDDDDDDDDDMMMXX Layer 1 XXIIDDDDDDDDDDDDDDDDDDDDDDDMMMXX Layer 0 -----------------------> Data Sector Configuration ========================= From the original Toshiba DVD proposal (circa Spring 1995), the following three items changed: - sector information in ID - EDC Generation Method - Initial Value of Main Data Scrambling The 2064 byte sector is, for purposes of error correction, organized into 12 separate rows, each with 172 bytes. 
The first row starts with the 12 Byte sector header (ID, IEC, Reserved
bytes), followed by the first 160 data bytes. The following 10 rows
contain only data. The final row holds the last 168 data bytes and is
punctuated with a 4 Byte field (EDC).

Row  Fields within row
---  ----------------------------------------------------------------------
 0   ID (4B)  IEC (2B)  RESERVED (6B)  Main Data (160 Bytes: D[0] - D[159])
 1   Main Data (172 Bytes: D[ 160] - D[ 331])
 2   Main Data (172 Bytes: D[ 332] - D[ 503])
 3   Main Data (172 Bytes: D[ 504] - D[ 675])
 4   Main Data (172 Bytes: D[ 676] - D[ 847])
 5   Main Data (172 Bytes: D[ 848] - D[1019])
 6   Main Data (172 Bytes: D[1020] - D[1191])
 7   Main Data (172 Bytes: D[1192] - D[1363])
 8   Main Data (172 Bytes: D[1364] - D[1535])
 9   Main Data (172 Bytes: D[1536] - D[1707])
10   Main Data (172 Bytes: D[1708] - D[1879])
11   Main Data (168 Bytes: D[1880] - D[2047])  EDC (4B)

ID:  Identification Data (32-bit sector number)
IEC: ID Error Correction
EDC: Error Detection Code

ECC Block Configuration
=======================

To combat bursty errors characteristic of CD-ROM, 16 sectors are
further interleaved together, forming a block of 192 rows
(16 sectors * 12 rows/sector = 192 rows).

Error correction bytes are concatenated to the data block in a
2-dimensional fashion (hence the term "product" in the phrase
"Reed-Solomon product codes"). Specifically: at the end of each row,
10 bytes of RS data are added (hence the RS(182,172,11) vector). At the
end of the block, 16 rows of RS data are added (hence the
RS(208,192,17) vector).

Therefore out of 37,856 total bytes (182*208) for the interleaved block
of data, 33,024 bytes (192*172) or roughly 87% is payload.

      <----- data block ----------->   <---------- P1 -------------->
  D   B[  0][ 0] ... B[  0][171]     | B[  0][172] .... B[  0][181]
  a   B[  1][ 0] ... B[  1][171]     | B[  1][172] .... B[  1][181]
  t         .                        |
  a         .                        |
            .                        |
      B[190][ 0] ... B[190][171]     | B[190][172] .... B[190][181]
      B[191][ 0] ... B[191][171]     | B[191][172] .... B[191][181]
      --------------------------------------------------------------
      B[192][ 0] ... B[192][171]     | B[192][172] .... B[192][181]
  P         .
  0         .
            .
      B[207][ 0] ... B[207][171]     | B[207][172] .... B[207][181]

  P1 (row parity, right-hand columns):   RS(182, 172, 11)
  P0 (column parity, bottom rows):       RS(208, 192, 17)

8/16 Modulation
===============

The lowest layer of the communications channel is the 8/16 channel
code, which helps reduce DC energy and lower the SNR threshold for the
pickup signal. Although the channel rate is doubled by the 8/16 code
(only half of the channel bits carry user data), the overall user
throughput at the desired uncorrected error rate of 1x10^-3 is greater
because of it.

The advantages of the 8/16 code are:

  - Small DC component (no long run lengths of 1's or 0's)
  - Applicable to RAM
  - Simple decoding circuits

From 16 channel bits, 8 user data bits are produced.

=============================================================
Source: Nimbus
Title:  Disc Manufacturing Technology and Equipment

DVD Laser Beam Recorder
  - with respect to CD, DVD only requires changes to recorder mask.
  - Ultra violet laser, argon ion
  - Wavelength of 351 nanometres
  - c. 5000 hours lifetime
  - final objective lens, n.a. 0.9
  - secondary focusing
  - aperture for CD mastering
  - spot beam focus checker is the most critical part.
  - yield rate for DVD (SS/SL): 90%

Operation in DVD or CD mode
  - Identical glass preparation and chemicals
  - universal lenses
  - switchable aperture
  - secondary focusing
  - elliptical spot for CD mastering

Elliptical spot:
  - reduces resolution across track
  - maintains DVD resolution along track to improve control of pit ends.

Production:
  - 200 - 300 master titles per month
  - 1.2 - 1.5 million stamped discs per month

=======================================================
Notes from Hitachi

Flow of Data in player:

Stage 1: SYNC detection, 8/16 Demodulation, ID Detection

  A total of 8 sync codes are inserted into the 8/16 modulated channel
  bitstream representing the current physical sector.
  Sync code words are unique in the 8/16 code table (so they cannot be
  generated by the 8-to-16 mapping). Detection looks for sync codes in
  order to determine where sectors begin and end.

  Here the channel bit rate input to this block is 26.16 Mbit/sec, and
  the output is 13 Mbit/sec.

Stage 2: Error detection and correction

  If the check bits (EDC) don't match the fingerprint of the
  unscrambled data, the Reed-Solomon parity bytes of the ECC block are
  used to attempt error correction of the corrupted data.

  Here the data rate output by this block is 11 Mbit/sec (2 Mbit/sec of
  error correction parity data has been stripped).

Stage 3: Descramble

  Data on the disc, which is scrambled for purposes of copy protection,
  is descrambled here.

Stage 4: EDC Check

  The fingerprint of the unscrambled data is checked against the EDC
  code to verify whether the data was correctly descrambled.

Stage 5: Track buffer

  This FIFO maps the constant user data bit rate of 11.08 Mbit/sec to
  the variable bit rate (Max mux rate 10.08 Mbit/sec) of the program
  streams.

Stage 6: Transfer to MPEG system decoder.

Track Buffer
============

The size of the track buffer is left to the implementation, although
the minimum recommended size is 2 Mbit. This is computed as:

  B > Tmax * VBRmax = 0.104 sec * 10.08 Mbit/sec   (about 1.05 Mbit)

Tmax is the maximum latency of one disc revolution, and VBRmax is the
maximum mux rate for any Program.

In some systems, the Track Buffer and the MPEG STD/VBV (System Target
Decoder/Video Buffer Verifier) are combined.

Seamless playback illustration
==============================

Input stream to Track Buffer:

  Time ---->                          n: sector number

                      |<------- T --------->|
  [n-3][n-2][n-1][ n] ... track jump ...     [m ][m+1][m+2][m+3][etc.]
                      (no data transfer during discontinuity)

Corresponding output from Track Buffer:

  Initial buffer delay
  introduced by track buffer
  |<--------->|
              [n-3][n-2][n-1][ n][m ][m+1][m+2][m+3][etc.]
                                 ^^
                  no apparent discontinuity from perspective
                  of MPEG Systems decoder.
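As a rough check of the track buffer arithmetic above, here is a small
Python sketch. It multiplies the worst-case gap by the maximum mux rate,
then steps through a gap to see whether a FIFO of a given size runs dry
while the decoder keeps draining it. The function names and the 1 ms
step are assumptions made for illustration, not anything from the
Hitachi presentation.

```python
def min_track_buffer_mbit(max_gap_s, max_mux_rate_mbit_s):
    """Mbit drained by the decoder while no data arrives from the disc."""
    return max_gap_s * max_mux_rate_mbit_s


def survives_gap(buffer_mbit, gap_s, drain_mbit_s, step_s=0.001):
    """Step through a read gap, starting from a full buffer.

    Returns False as soon as the FIFO level would go negative."""
    level = buffer_mbit
    for _ in range(round(gap_s / step_s)):
        level -= drain_mbit_s * step_s
        if level < -1e-9:  # small tolerance for floating-point drift
            return False
    return True


T_MAX_S = 0.104         # maximum latency of one disc revolution
VBR_MAX_MBIT_S = 10.08  # maximum mux rate

needed = min_track_buffer_mbit(T_MAX_S, VBR_MAX_MBIT_S)
print(f"lower bound on buffer size: {needed:.3f} Mbit")
print("2 Mbit buffer survives the gap:",
      survives_gap(2.0, T_MAX_S, VBR_MAX_MBIT_S))
print("1 Mbit buffer survives the gap:",
      survives_gap(1.0, T_MAX_S, VBR_MAX_MBIT_S))
```

With these figures the bound comes out a little over 1 Mbit, so the
recommended 2 Mbit rides out a one-revolution gap with room to spare,
while a 1 Mbit buffer would just underflow.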
The memory size needed for seamless playback control can be computed
as:

  T * VBR = 0.25 seconds * 8 Mbit/sec = 2 Mbits

This is of course implementation-dependent. T here is the time taken by
a jump of the maximum distance (10,000 sectors).

Labeling information [from Warner Advanced Media Operations]
=================================================================

Labeling can be similar to standard CD labels or one of three new
types:

  - Reverse Printing: underside of blank 0.6 mm clear substrate
    provides unique wet look and additional protection.
  - Mastered in Graphics: by transferring images directly to the glass
    master, ensuring 100% yield.
  - Laser Scribed Titles: on stampers, image added right at press.

DVD Double-Sided Disc Label solution:

  - the inner radius of DVD is smaller than regular CDs (to improve
    areal utilization of disc, hence capacity). This favors the outer
    edge.
  - labels are printed along the outer 5 mm edge of the disc.

  Label                        Angular arc size
  --------------------------   --------------------
  Movie Title Information:     217 degrees
  Disc ID Code:                57 degrees
  Side:                        25 degrees
  Company:                     29 degrees
  Gaps between above labels:   8 degrees x 4 gaps

==========================================================
Hollywood Panel Discussion
==========================================================

Executive leaders from the home video branches of MGM, MCA, Warner,
Columbia, Turner (New Line Cinema) and Tri-Star were present.

The three primary issues for them are:

1. Availability of Software: getting titles mastered and pressed to
   entice people to buy DVD players.

   Most executives on the panel felt that new titles should be in the
   $20 range, and that "fully amortized" titles (read: talent already
   received and spent their money) will allow studios to resell their
   vast libraries. Only one executive felt his studio will market new
   titles at higher prices, in the $60 range.

   These older library titles will be priced lower ($10 ?) in order to
   attract hardware purchases.
   The idea is to convince DVD owners to own their own libraries rather
   than rent.

2. Timing of releases around the world: studios do not want American
   titles to be played on European players, since the release dates of
   video (and cinema for that matter) differ around the world for
   political and logistical (seasons, holidays) reasons. Region Codes
   would become the mechanism.

3. Copy protection: studios expect closure on this heated issue within
   weeks.

- currently only Disney, Paramount (Blockbuster), and Fox have yet to
  announce DVD policies.

- It is hoped that DVD will stimulate the US music video market, where
  VHS has clearly proven not to be the format.

----- End of notes

cfogg@chromatic.com