Speech in Mobile and Pervasive Environments
Category: Books, Imported Original Editions, Computers & Internet, Web Development, Web Design
Brand: Nitendra Rajput, Amit Anil Nanavati
Basic information
Publisher: Wiley; 1st edition (April 3, 2012)
Series: Wireless Communications and Mobile Computing
Hardcover: 312 pages
Language: English
ISBN: 0470694351
Barcode: 9780470694350
Product dimensions: 16.4 x 2.6 x 23.3 cm
Product weight: 594 g
ASIN: 0470694351

Product description

Synopsis
This book provides a cross-disciplinary reference to speech in mobile and pervasive environments.

Speech in Mobile and Pervasive Environments addresses the issues related to speech processing on resource-constrained mobile devices. These include speech recognition in noisy environments, specialised hardware for speech recognition and synthesis, the use of context to enhance recognition and user experience, and the emerging software standards required for interoperability. This book takes a multi-disciplinary look at these matters, while offering an insight into the opportunities and challenges of speech processing in mobile environs. In developing regions, speech-on-mobile is set to play a momentous role, socially and economically; the authors discuss how voice-based solutions and applications offer a compelling and natural solution in this setting.

Key Features
- Provides a holistic overview of all speech technology related topics in the context of mobility
- Brings together the latest research in a logically connected way in a single volume
- Covers hardware, embedded recognition and synthesis, distributed speech recognition, software technologies, contextual interfaces
- Discusses multimodal dialogue systems and their evaluation
- Introduces speech in mobile and pervasive environments for developing regions

This book provides a comprehensive overview for beginners and experts alike. It can be used as a textbook for advanced undergraduate and postgraduate students in electrical engineering and computer science. Students, practitioners or researchers in the areas of mobile computing, speech processing, voice applications, human-computer interfaces, and information and communication technologies will also find this reference insightful. For experts in the above domains, this book complements their strengths. In addition, the book will serve as a guide to practitioners working in telecom-related industries.

Table of contents

About the Series Editors
List of Contributors
Foreword
Preface
Acknowledgments

1 Introduction
  1.1 Application design
  1.2 Interaction modality
  1.3 Speech processing
  1.4 Evaluations

2 Mobile Speech Hardware: The Case for Custom Silicon
  2.1 Introduction
  2.2 Mobile hardware: Capabilities and limitations
    2.2.1 Looking inside a mobile device: Smartphone example
    2.2.2 Processing limitations
    2.2.3 Memory limitations
    2.2.4 Power limitations
    2.2.5 Silicon technology and mobile hardware
  2.3 Profiling existing software systems
    2.3.1 Speech recognition overview
    2.3.2 Profiling techniques summary
    2.3.3 Processing time breakdown
    2.3.4 Memory usage
    2.3.5 Power and energy breakdown
    2.3.6 Summary
  2.4 Recognizers for mobile hardware: Conventional approaches
    2.4.1 Reduced-resource embedded recognizers
    2.4.2 Network recognizers
    2.4.3 Distributed recognizers
    2.4.4 An alternative approach: Custom hardware
  2.5 Custom hardware for mobile speech recognition
    2.5.1 Motivation
    2.5.2 Hardware implementation: Feature extraction
    2.5.3 Hardware implementation: Feature scoring
    2.5.4 Hardware implementation: Search
    2.5.5 Hardware implementation: Performance and power evaluation
    2.5.6 Hardware implementation: Summary
  2.6 Conclusion
  Bibliography

3 Embedded Automatic Speech Recognition and Text-to-Speech Synthesis
  3.1 Automatic speech recognition
  3.2 Mathematical formulation
  3.3 Acoustic parameterization
    3.3.1 Landmark-based approach
  3.4 Acoustic modeling
    3.4.1 Unit selection
    3.4.2 Hidden Markov models
  3.5 Language modeling
  3.6 Modifications for embedded speech recognition
    3.6.1 Feature computation
    3.6.2 Likelihood computation
  3.7 Applications
    3.7.1 Car navigation systems
    3.7.2 Smart homes
    3.7.3 Interactive toys
    3.7.4 Smartphones
  3.8 Text-to-speech synthesis
  3.9 Text to speech in a nutshell
  3.10 Front end
  3.11 Back end
    3.11.1 Rule-based synthesis
    3.11.2 Data-driven synthesis
    3.11.3 Statistical parametric speech synthesis
  3.12 Embedded text-to-speech
  3.13 Evaluation
  3.14 Summary
  Bibliography

4 Distributed Speech Recognition
  4.1 Elements of distributed speech processing
  4.2 Front-end processing
    4.2.1 Device requirements
    4.2.2 Transmission issues in DSR
    4.2.3 Back-end processing
  4.3 ETSI standards
    4.3.1 Basic front-end standard ES 201 108
    4.3.2 Noise-robust front-end standard ES 202 050
    4.3.3 Tonal-language recognition standard ES 202 211
  4.4 Transfer protocol
    4.4.1 Signaling
    4.4.2 RTP payload format
  4.5 Energy-aware distributed speech recognition
  4.6 ESR, NSR, DSR
  Bibliography

5 Context in Conversation
  5.1 Context modeling and aggregation
    5.1.1 An example of composer specification
  5.2 Context-based speech applications: Conspeakuous
    5.2.1 Conspeakuous architecture
    5.2.2 B-Conspeakuous
    5.2.3 Learning as a source of context
    5.2.4 Implementation
    5.2.5 A tourist portal application
  5.3 Context-based speech applications: Responsive information architect
  5.4 Conclusion
  Bibliography

6 Software: Infrastructure, Standards, Technologies
  6.1 Introduction
  6.2 Mobile operating systems
  6.3 Voice over internet protocol
    6.3.1 Implications for mobile speech
    6.3.2 Sample speech applications
    6.3.3 Access channels
  6.4 Standards
  6.5 Standards: VXML
  6.6 Standards: VoiceFleXML
    6.6.1 Brief overview of speech-based systems
    6.6.2 System architecture
    6.6.3 System architecture: VoiceFleXML interpreter
    6.6.4 VoiceFleXML: Voice browser
    6.6.5 A prototype implementation
  6.7 SAMVAAD
    6.7.1 Background and problem setting
    6.7.2 Reorganization algorithms
    6.7.3 Minimizing the number of dialogs
    6.7.4 Hybrid call-flows
    6.7.5 Minimally altered call-flows
    6.7.6 Device-independent call-flow characterization
    6.7.7 SAMVAAD: Architecture, implementation and experiments
    6.7.8 Splitting dialog call-flows
  6.8 Conclusion
  6.9 Summary and future work
  Bibliography

7 Architecture of Mobile Speech-Based and Multimodal Dialog Systems
  7.1 Introduction
  7.2 Multimodal architectures
  7.3 Multimodal frameworks
  7.4 Multimodal mobile applications
    7.4.1 Mobile companion
    7.4.2 MUMS
    7.4.3 TravelMan
    7.4.4 Stopman
  7.5 Architectural models
    7.5.1 Client–server systems
    7.5.2 Dialog description systems
    7.5.3 Generic model for distributed mobile multimodal speech systems
  7.6 Distribution in the Stopman system
  7.7 Conclusions
  Bibliography

8 Evaluation of Mobile and Pervasive Speech Applications
  8.1 Introduction
    8.1.1 Spoken interaction
    8.1.2 Mobile-use context
    8.1.3 Speech and mobility
  8.2 Evaluation of mobile speech-based systems
    8.2.1 User interface evaluation methodology
    8.2.2 Technical evaluation of speech-based systems
    8.2.3 Usability evaluations
    8.2.4 Subjective metrics and objective metrics
    8.2.5 Laboratory and field studies
    8.2.6 Simulating mobility in the laboratory
    8.2.7 Studying social context
    8.2.8 Long- and short-term studies
    8.2.9 Validity
  8.3 Case studies
    8.3.1 STOPMAN evaluation
    8.3.2 TravelMan evaluation
    8.3.3 Discussion
  8.4 Theoretical measures for dialog call-flows
    8.4.1 Introduction
    8.4.2 Dialog call-flow characterization
    8.4.3 (m,q,a)-characterization
    8.4.4 (m,q,a)-complexity
    8.4.5 Call-flow analysis using (m,q,a)-complexity
  8.5 Conclusions
  Bibliography

9 Developing Regions
  9.1 Introduction
  9.2 Applications and studies
    9.2.1 VoiKiosk
    9.2.2 HealthLine
    9.2.3 The spoken web
    9.2.4 TapBack
  9.3 Systems
  9.4 Challenges
  Bibliography

Index