The Writings of K. S. Yim


	Efficiency & Scalability [1999-]	Dependability & Measurement [2007-]	Quality & Productivity [2011-]	Security & Privacy [2015-]	Usability [2019-]
GenAI Generative Artificial Intelligence		Forecasting arXiv	Test Selection Report		TPU Colab Tutorial'25
IR Information Retrieval	Entity Pivoting Patent App Indexing Report'24 Report'23	App Automation Patent	RAG Report		Benchmark arXiv RL for HCI Report
ML Machine Learning		GPGPU IPDPS'11 Watchdog IPDPS'13 Unsupervised learning IPDPS'14	Bigdata monitoring ISSRE'16	Review bot pre-print Virtual assistant Patent	RecSys Patent
SE Software Engineering	Debugger Patent	Production reliability model BigData'25	Release & deployment ISSRE'14 Testing infra LNCS'11 MapReduce testing Patent	Security assessment Book chapter System fuzzing Videos
OS Operating Systems	Fast booting SAC'05 Low-power WSNs LNCS'04a LNCS'04b Real-time OS LNCS'06 SynergyFS OLS'08 OS jitter Patent	Experimental validation DSN'10 PRDC'09	API compliance for ecosystem Tutorial'17 Android virtual device Patent	Android updatability TECS'19 System security SRDS'16
ComSys Computer Systems	Flash storage JSTS'05 TCE'04 Code optimization RTCSA'06 Compressed memory LNCS'04c PDPTA'03 Reconfigurable processor Patent1 & 2	Co-processor Aerospace'12

My Papers and Patents

K. S. Yim,
"Query Refinement Using Optical Character Recognition,"
United States Patent, No. 12,536,225, January 2026 (Application on December 2023).
K. S. Yim,
“Accelerating Generative Artificial Intelligence and Machine Learning with Tensor Processing Units,”
IEEE International Conference on Big Data, Tutorial, 2025.
Details
K. S. Yim,
"Forecasting Extreme Production Outages in Agile, Big Data and Machine Learning Services: Simple, Two-Parameter Software Reliability Models for Root Cause Insights,"
in Proceedings of the IEEE International Conference on Big Data, 1st Special Session on Cybersecurity and Telecommunication in the Era of AI, December 2025.
IEEE .BIB
Time series forecasting models have diverse real-world applications, yet forecasting sporadic or spiky production outages of cloud computing services remains a challenging target. Traditional one-parameter Software Reliability Growth Models (SRGMs) are inadequate for accurately estimating outages in modern agile software environments for big data computing. This inadequacy stems from the continuous introduction and removal of defects, constantly evolving total defect counts, and non-constant defect detection rates in agile software, further complicated by operational issues like release and deployment challenges contributing to outages. In this paper, we address these limitations by optimizing a fundamental reliability model to estimate aggregated time series of sporadic, spiky production outages of big data machine learning (ML) services. Our analysis utilizes three years of production incident statistics from planet-scale services with billions of users. We conduct a comprehensive curve fitting study across daily, weekly, and monthly aggregated outage counts against a total of 55 standard distribution functions. We empirically demonstrate that two-parameter distributions, specifically beta and wrapped Cauchy, consistently provide the best fit for total production outages across all granularities, highlighting the necessity of multi-parameter models for agile software reliability. Furthermore, by classifying outages by their root cause type (e.g., capacity, client, data, ML, migration) based on manual post-mortem analyses, we find that root cause-specific outages often represent even more extreme events than total outage counts, requiring two- or multi-parameter models for accurate forecasting. This granular understanding is crucial for big data service operators (e.g., on-call engineers) to identify root causes and apply mitigation techniques in a timely manner.
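As a rough illustration of what fitting a two-parameter distribution involves, the following Python sketch estimates the two beta shape parameters by the method of moments. The sample counts, the rescaling into (0, 1), and the function name `fit_beta_moments` are assumptions made for this example; the paper's own fitting procedure and production data are not reproduced here.

```python
from statistics import fmean, pvariance


def fit_beta_moments(samples):
    """Estimate beta(a, b) shape parameters by the method of moments.

    All samples must lie strictly inside the open interval (0, 1).
    """
    m = fmean(samples)
    v = pvariance(samples, mu=m)
    # A beta distribution requires 0 < mean < 1 and 0 < variance < mean * (1 - mean).
    if not 0.0 < m < 1.0 or v <= 0.0 or v >= m * (1.0 - m):
        raise ValueError("samples are not compatible with a beta distribution")
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common


# Hypothetical weekly outage counts (made up for illustration only).
weekly_outages = [0, 1, 0, 0, 7, 0, 2, 0, 0, 1, 12, 0]

# Rescale counts into (0, 1). The +1 / +2 padding is an assumption of this
# sketch that keeps every value away from the endpoints 0 and 1.
peak = max(weekly_outages)
scaled = [(count + 1) / (peak + 2) for count in weekly_outages]

a, b = fit_beta_moments(scaled)
print(f"beta shape parameters: a={a:.3f}, b={b:.3f}")
```

Two free parameters let the fitted curve match both the mean and the spread of the data, which a one-parameter model cannot do independently.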
K. S. Yim,
"Evaluation of a Foundational Model and Stochastic Models for Forecasting Sporadic or Spiky Production Outages of High-Performance Machine Learning Services,"
arXiv:2507.01067, 2025.
pre-print .BIB
Time series forecasting models have diverse real-world applications (e.g., from electricity metrics to software workload). The latest foundational models trained for time series forecasting show strengths (e.g., for long sequences and in zero-shot settings). However, foundational models have not yet been used to forecast rare, spiky events, a challenging target because such events are a corner case of extreme events. In this paper, we optimize a state-of-the-art foundational model to forecast sporadic or spiky production outages of high-performance machine learning services powering billions of client devices. We evaluate the forecasting errors of the foundational model compared with classical stochastic forecasting models (e.g., moving average and autoregressive). The analysis helps us understand how each of the evaluated models performs for the sporadic or spiky events. For example, it identifies the key patterns in the target data that are well tracked by the foundational model vs. each of the stochastic models. We use the models with optimal parameters to estimate the year-long outage statistics of a particular root cause with less than 6% value errors.
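The two classical baselines named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the daily counts are invented, the AR(1) coefficients are fit in-sample on the whole series, and the helper names are hypothetical. It does not reproduce the paper's models, parameters, or results.

```python
def moving_average_forecast(series, window):
    """One-step-ahead forecasts: the mean of the previous `window` observations."""
    return [sum(series[t - window:t]) / window for t in range(window, len(series))]


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    phi = sxy / sxx if sxx else 0.0
    return my - phi * mx, phi


def mean_abs_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily outage counts with sporadic spikes (illustration only).
daily = [0, 0, 3, 0, 0, 0, 5, 1, 0, 0, 0, 4, 0, 0]
window = 3

ma_pred = moving_average_forecast(daily, window)
c, phi = fit_ar1(daily)  # in-sample fit, an assumption of this sketch
ar_pred = [c + phi * daily[t - 1] for t in range(window, len(daily))]

actual = daily[window:]
ma_mae = mean_abs_error(actual, ma_pred)
ar_mae = mean_abs_error(actual, ar_pred)
print(f"moving average MAE={ma_mae:.3f}, AR(1) MAE={ar_mae:.3f}")
```

Both baselines smooth over recent history, which is one reason isolated spikes are hard for them to anticipate.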
K. S. Yim,
"Actionable Suggestions for Media Content,"
United States Patent, No. 12,242,528, March 2025 (Application on September 2022).
K. S. Yim,
"Generating a selectable suggestion using a provisional machine learning model when use of a default suggestion model is inconsequential,"
United States Patent, No. 12,259,947, March 2025 (Application on May 2020).
K. S. Yim,
"Rendering suggestion for searching entity within application in response to determining the entity is associated with the application,"
United States Patent, No. 12,353,427, July 2025 (Application on June 2023).
K. S. Yim, I. Firman, A. M. Coimbra, R. J. Berry, M. Ionut-Andrecia, M. Reutov, G. O. Taubman, C. S. Kuang, M. Oh, S. R. Ganov, and K. R. Desineni,
"Human-in-the-Loop Voice Automation System,"
United States Patent, No. 12,387,724, August 2025 (Application on May 2022).
K. S. Yim and B. Caprita,
"Mitigating latency and/or resource usage in triggering actionable suggestions related to rendered content,"
United States Patent, No. 12,488,061, December 2025 (Application on October 2022).
K. S. Yim,
"Predicting Likely-Vulnerable Code Changes: Machine Learning-based Vulnerability Protections for Android Open Source Project,"
arXiv:2405.16655, 2024.
pre-print .BIB
This paper presents a framework that selectively triggers security reviews for incoming source code changes. Functioning as a review bot within a code review service, the framework can automatically request additional security reviews at pre-submit time before the code changes are submitted to a source code repository. Because performing such secure code reviews adds cost, the framework employs a classifier trained to identify code changes with a high likelihood of vulnerabilities. The online classifier leverages various types of input features to analyze the review patterns, track the software engineering process, and mine specific text patterns within given code changes. The classifier and its features are meticulously chosen and optimized using data from the submitted code changes and reported vulnerabilities in the Android Open Source Project (AOSP). The evaluation results demonstrate that our Vulnerability Prevention (VP) framework identifies approximately 80% of the vulnerability-inducing code changes in the dataset with a precision ratio of around 98% and a false positive rate of around 1.7%. We discuss the implications of deploying the VP framework in multi-project settings and future directions for Android security research. This paper explores and validates our approach to code change-granularity vulnerability prediction, offering a preventive technique for software security by preemptively detecting vulnerable code changes before submission.
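For readers less familiar with the three figures quoted above, the Python sketch below shows how recall, precision, and false positive rate are derived from a confusion matrix. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the paper's dataset.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (precision, recall, false_positive_rate) from confusion-matrix counts."""
    precision = tp / (tp + fp)            # flagged changes that were truly vulnerable
    recall = tp / (tp + fn)               # vulnerable changes that were flagged
    false_positive_rate = fp / (fp + tn)  # benign changes that were flagged
    return precision, recall, false_positive_rate


# Hypothetical counts (illustration only, not taken from the paper).
tp, fn = 80, 20  # vulnerability-inducing changes: flagged vs. missed
fp, tn = 2, 98   # benign changes: flagged vs. passed

precision, recall, fpr = classification_metrics(tp, fp, tn, fn)
print(f"precision={precision:.3f} recall={recall:.3f} fpr={fpr:.3f}")
```

Recall and false positive rate are each computed within one class, while precision mixes both classes and therefore depends on how many benign changes exist relative to vulnerable ones.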
K. S. Yim,
"The Task-oriented Queries Benchmark (ToQB),"
arXiv:2406.02943, 2024.
Paper GitHub .BIB
Task-oriented queries (e.g., one-shot queries to play videos, order food, or call a taxi) are crucial for assessing the quality of virtual assistants, chatbots, and other large language model (LLM)-based services. However, a standard benchmark for task-oriented queries is not yet available, as existing benchmarks in the relevant NLP (Natural Language Processing) fields have primarily focused on task-oriented dialogues. Thus, we present a new methodology for efficiently generating the Task-oriented Queries Benchmark (ToQB) using existing task-oriented dialogue datasets and an LLM service. Our methodology involves formulating the underlying NLP task to summarize the original intent of a speaker in each dialogue, detailing the key steps to perform the devised NLP task using an LLM service, and outlining a framework for automating a major part of the benchmark generation process. Through a case study encompassing three domains (i.e., two single-task domains and one multi-task domain), we demonstrate how to customize the LLM prompts (e.g., omitting system utterances or speaker labels) for those three domains and characterize the generated task-oriented queries. The generated ToQB dataset is made available to the public. We further discuss new domains that can be added to ToQB by community contributors and its practical applications.
K. S. Yim,
"Disparate sourcing of tree data structures for searching and suggesting application services,"
United States Patent, No. 12,153,941, November 26, 2024 (Application on November 6, 2023).
K. S. Yim and Z. Chen,
"Launching determination based on login status,"
United States Patent, No. 11,971,801, April 30, 2024 (Application on December 15, 2022).
K. S. Yim,
"Assessment of Security Defense of Native Programs Against Software Faults,"
System Dependability and Analytics, Wang, L., Pattabiraman, K., Di Martino, C., Athreya, A., Bagchi, S. (eds), Springer Series in Reliability Engineering, Springer, Cham., 2023.
Springer Link .BIB
This chapter explores the possibility of building a unified assessment methodology for software reliability and security. The fault injection methodology originally designed for reliability assessment is extended to quantify and characterize the security defense aspect of native applications. Native application refers to system software written in the C/C++ programming languages. Specifically, software fault injection is used to measure the portion of injected software faults caught by the built-in error detection mechanisms of a target program (e.g., the detection coverage of assertions). To automatically activate as many injected faults as possible, a gray box fuzzing technique is used. Using dynamic analyzers during fuzzing further helps us catch the critical error propagation paths of injected (but undetected) faults, and identify code fragments as targets for security hardening. Because conducting software fault injection experiments for fuzzing is an expensive process, a novel, locality-based fault selection algorithm is presented. The presented algorithm increases the fuzzing failure ratios by 3–19 times, speeding up the experiments. The case studies use all the above experimental techniques in order to compare the effectiveness of fuzzing and testing, and consequently assess the security defense of native benchmark programs.
K. S. Yim, K. Y. Lim, and U. Patil,
"Recommending Action(s) Based on Entity or Entity Type,"
United States Patent, No. 11,790,173, October 17, 2023 (Application on October 28, 2020).
K. S. Yim and I. Firman,
"Encoding/decoding user interface interactions,"
United States Patent, No. 11,726,641, August 15, 2023 (Application on February 14, 2022).
K. S. Yim,
"Fulfillment of actionable requests ahead of a user selecting a particular autocomplete suggestion for completing a current user input,"
United States Patent, No. 11,556,707, January 17, 2023 (Application on June 18, 2020).
K. S. Yim, Z. Chen, and B. G. Lim,
"Selectively Rendering a Keyboard Interface in Response to an Assistant Invocation in Certain Circumstances,"
United States Patent, No. 11,481,686, October 25, 2022 (Application on May 13, 2021).
K. S. Yim,
"Open source software testing,"
United States Patent, No. 11,216,357, January 4, 2022 (Application on February 22, 2018).
K. S. Yim and I. Malchev,
"Selective simulation of virtualized hardware inputs,"
United States Patent, No. 11,354,464, June 7, 2022 (Application on June 26, 2020).
K. S. Yim,
"Automated assistant architecture for preserving privacy of application content,"
United States Patent, No. 11,374,887, June 28, 2022 (Application on October 29, 2019).
R. Shah, M. Ben-Ari, and K. S. Yim,
"Automated device test triaging system and techniques,"
United States Patent, No. 11,113,183, September 7, 2021 (Application on November 10, 2017).
K. S. Yim and I. Malchev,
"Selective simulation of virtualized hardware inputs,"
United States Patent, No. 10,740,511, August 11, 2020 (Application on April 25, 2019).
I. B. Malchev and K. S. Yim,
"Operating system validation,"
United States Patent, No. 10,754,765, August 25, 2020 (Application on December 13, 2017).
K. S. Yim, I. Malchev, A. Hsieh, and D. Burke,
"Treble: Fast Software Updates by Creating an Equilibrium in an Active Software Ecosystem of Globally Distributed Stakeholders,"
ACM Transactions on Embedded Computing Systems (TECS), Vol. 18, Issue 5s, Article 104, 23 pages, October 2019. (SCIE, Impact Factor: 1.156)
ACM DL pre-print .BIB
This paper presents our experience with Treble, a two-year initiative to build the modular base in Android, a Java-based mobile platform running on the Linux kernel. Our Treble architecture splits the hardware-independent core framework written in Java from the hardware-dependent vendor implementations (e.g., user space device drivers, vendor native libraries, and kernel written in C/C++). Cross-layer communications between them are done via versioned, stable inter-process communication interfaces whose backward compatibility is tested by using two API compliance suites. Based on this architecture, we repackage the key Android software components that suffered from crucial post-launch security bugs as separate images. That enables not only separate ownership but also independent updates of each image by interested ecosystem entities. We discuss our experience of delivering Treble architectural changes to silicon vendors and device makers using a yearly release model. Our experiments and industry rollouts support our hypothesis that giving more freedom to all ecosystem entities and creating an equilibrium are transformations necessary to further scale the world's largest open source ecosystem with over two billion active devices.
K. S. Yim and I. Malchev,
"Multi-layer test suite generation,"
United States Patent, No. 10,482,002, November 19, 2019 (Application on September 20, 2018).
K. S. Yim and I. Malchev,
"Selective simulation of virtualized hardware inputs,"
United States Patent, No. 10,303,720, May 28, 2019 (Application on August 17, 2016).
K. S. Yim, S. R. Seelam, L. L. Fong, A. Iyengar, and J. Lewars,
"Filtering system noises in parallel computer system during thread synchronization,"
United States Patent, No. 10,203,996, February 12, 2019 (Application on December 16, 2016).
K. S. Yim and I. Malchev,
"Middleware interface and middleware interface generator,"
United States Patent, No. 10,019,298, July 10, 2018 (Application on August 17, 2016).
K. S. Yim, I. B. Malchev, and D. Burke,
“A Taste of Android Oreo (v8.0) Device Manufacturer,”
ACM Symposium on Operating Systems Principles (SOSP), Tutorial, 2017.
Extended Abstract Slide Website
In 2017, over two billion Android devices developed by more than a thousand device manufacturers (DMs) around the world are actively in use. Historically, silicon vendors (SVs), DMs, and telecom carriers extended the Android Open Source Project (AOSP) platform source code and used the customized code in final production devices. Forking, on the other hand, makes it hard to accept upstream patches (e.g., security fixes). In order to reduce such software update costs, starting from Android v8.0, the new Vendor Test Suite (VTS) splits hardware-independent framework and hardware-dependent vendor implementation by using versioned, stable APIs (namely, vendor interface). Android v8.0 thus opens the possibility of a fast upgrade of the Android framework as long as the underlying vendor implementation passes VTS. This tutorial teaches how to develop, test, and certify a compatible Android vendor interface implementation running below the framework. We use an Android Virtual Device (AVD) emulating an Android smartphone device to implement a user-space device driver which uses formalized interfaces and RPCs, develop VTS tests for that component, execute the extended tests, and certify the extended vendor implementation.
K. S. Yim et al.,
Android VTS and CTS-on-GSI video series, Android Open Source Project (AOSP), first released in 2017.
Videos
K. S. Yim and P. Patnaik,
"Testing Application Software Using Virtual or Physical Devices,"
United States Patent, No. 9,703,691, July 11, 2017 (Application on June 15, 2015).
K. S. Yim, S. R. Seelam, L. L. Fong, A. Iyengar, and J. Lewars,
"Monitoring system noises in parallel computer systems,"
United States Patent, No. 9,558,095, January 31, 2017 (Application on March 30, 2016).
K. S. Yim,
"The Rowhammer Attack Injection Methodology,"
in Proceedings of the IEEE Symposium on Reliable Distributed Systems (SRDS), pp. 1-10, September 2016. (Acceptance Ratio: 32.5% = 27/83)
Paper Slide
This paper presents a systematic methodology to identify and validate security attacks that exploit user-influenceable hardware faults (i.e., rowhammer errors). We break down rowhammer attack procedures into nine generalized steps where some steps are designed to increase the attack success probabilities. Our framework can perform those nine operations (e.g., pressuring system memory and spraying landing pages) as well as inject rowhammer errors which are basically modeled as ≥3-bit errors. When one of the injected errors is activated, it can cause control or data flow divergences which can then be caught by a prepared landing page and thus lead to a successful attack. Our experiments conducted against a guest operating system of a typical cloud hypervisor identified multiple reproducible targets for privilege escalation, shell injection, memory and disk corruption, and advanced denial-of-service attacks. Because the presented rowhammer attack injection (RAI) methodology uses error injection and thus statistical sampling, RAI can quantitatively evaluate the modeled rowhammer attack success probabilities of any given target software states.
K. S. Yim,
"Evaluation Metrics of Reliability Monitoring Rules of a Big Data Service,"
in Proceedings of the IEEE International Symposium on Software Reliability Engineering (ISSRE), pp. 376-387, October 2016. (Acceptance Ratio: 35% = 45/130)
Paper Slide
This paper presents new metrics to evaluate the reliability monitoring rules of a large-scale big data service. Our target service uses manually-tuned, service-level reliability monitoring rules. Using the measurement data, we identify two key technical challenges in operating our target monitoring system. In order to improve the operational efficiency, we characterize how those rules were manually tuned by the domain experts. The characterization results provide useful information to operators who are expected to tune such rules regularly. Using the actual production failure data, we evaluate the same monitoring rules by using standard metrics and the presented metrics. Our evaluation results show the strengths and weaknesses of each metric and show that the presented metrics can further help operators recognize when and which rules need to be re-tuned.
K. S. Yim,
"Fault tolerance model, methods, and apparatuses and their validation techniques,"
United States Patent, No. 9,317,254, April 19, 2016 (Application on December 4, 2013).
K. S. Yim, S. R. Seelam, L. L. Fong, A. Iyengar, and J. Lewars,
"Filtering system noises in parallel computer systems during thread synchronization,"
United States Patent, No. 9,361,202, June 7, 2016 (Application on July 18, 2013).
K. S. Yim,
"Methods and apparatuses for automated testing of streaming applications using MapReduce-like middleware,"
United States Patent, No. 9,298,590, March 29, 2016 (Application on June 26, 2014). *Typo in last name amended.
S.-w. Lee, H.-c. Kim, Y.-s. Shin, M.-k. Jeong, K. S. Yim, J.-j. Yoo, J.-d. Lee,
"Method and apparatus for preventing stack overflow in embedded system,"
United States Patent, No. 9,280,500, March 8, 2016 (Application on January 3, 2008).
J. D. Lee, S.-w. Lee, J.-j. Yoo, Y.-s. Shin, M.-k. Jeong, and K. S. Yim,
"Method, medium and apparatus scheduling tasks in a real time operating system,"
United States Patent, No. 9,009,714, April 14, 2015 (Application on December 19, 2007).
K. S. Yim,
"Norming to Performing: Failure Analysis and Deployment Automation of Big Data Software Developed by Highly Iterative Models,"
in Proceedings of the IEEE International Symposium on Software Reliability Engineering (ISSRE), pp. 144-155, December 2014. (Acceptance Ratio: 25% = 31/124)
Paper Slide
We observe many interesting failure characteristics from Big Data software developed and released using highly iterative development models (e.g., agile). ~16% of failures occur due to faults in software deployments (e.g., packaging and pushing to production). Our analysis shows that many such production outages are at least partially due to human errors rooted in the high frequency and complexity of software deployments. ~51% of the observed human errors (e.g., transcription, education, and communication error types) are avoidable through automation. We thus develop a fault-tolerant automation framework to make it efficient to automate end-to-end software deployment procedures. We apply the framework to two Big Data products. Our case studies show the complexity of the deployment procedures of multi-homed Big Data applications and help us to study the effectiveness of the validation and verification techniques for user-provided automation programs. We analyze the production failures of the two products again after the automation. Our experimental data shows how the automation and the associated procedure improvements reduce the deployment faults and overall failure rate, and improve the feature launch velocity. Automation facilitates more formal, procedure-driven software engineering practices which not only reduce the manual work and human-oriented, avoidable production outages but also help engineers to better understand overall software engineering procedures, making them more auditable, predictable, reliable, and efficient. We discuss two novel metrics to evaluate progress in mitigating human errors and the conditions indicating when to start such a transition from owner-driven deployment practice.
K. S. Yim,
"Characterization of Impact of Transient Faults and Detection of Data Corruption Errors in Large-Scale N-Body Programs Using Computational Accelerators,"
in Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 458-467, May 2014. (Acceptance Ratio: 21.1% = 114/541)
Paper Slide
In N-body programs, trajectories of simulated particles have chaotic patterns if errors are in the initial conditions or occur during some computation steps. It was believed that the global properties (e.g., total energy) of simulated particles are unlikely to be affected by a small number of such errors. In this paper, we present a quantitative analysis of the impact of transient faults in GPU devices on a global property of simulated particles. We experimentally show that a single-bit error in non-control data can change the final total energy of a large-scale N-body program with ~2.1% probability. We also find that the corrupted total energy values have certain biases (e.g., the values do not follow a normal distribution), which can be used to reduce the expected number of re-executions. In this paper, we also present a data error detection technique for N-body programs by utilizing two types of properties that hold in simulated physical models. The presented technique and an existing redundancy-based technique together cover many data errors (e.g., >97.5%) with a small performance overhead (e.g., 2.3%).
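To make the notion of a single-bit error in non-control data concrete, the Python sketch below flips one bit in the IEEE 754 double-precision encoding of a value. It is a host-side illustration only: the function name `flip_bit` and the sample value are assumptions, and the paper's GPU fault injection tooling is not shown here.

```python
import struct


def flip_bit(value, bit):
    """Return `value` with one bit of its IEEE 754 double encoding flipped.

    Bit 0 is the least significant mantissa bit; bit 63 is the sign bit.
    """
    if not 0 <= bit < 64:
        raise ValueError("bit index must be in the range 0..63")
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return flipped


energy = 1.0                      # hypothetical data value
low = flip_bit(energy, 0)         # lowest mantissa bit: a tiny perturbation
sign = flip_bit(energy, 63)       # sign bit: the value becomes negative
high = flip_bit(energy, 62)       # top exponent bit: 1.0 becomes infinity
print(low, sign, high)
```

The effect of one flipped bit depends heavily on its position, which is why an error in a data word can be either numerically negligible or large enough to distort an aggregate such as total energy.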
K. S. Yim, J.-j. Yoo, J.-k. Park,
"Data storage system with complex memory and method of operating the same,"
United States Patent, No. 8,812,771, August 19, 2014 (Application on February 12, 2010).
K. S. Yim,
"Storage device including a file system manager for managing multiple storage media,"
United States Patent, No. 8,892,520, November 18, 2014 (Application on January 23, 2009).
K. S. Yim, J.-j. Yoo, J.-w. Kim, S.-j. Ryu, J.-K. Park, J.-d. Lee, Y.-s. Shin,
"Multitasking method and apparatus for reconfigurable array,"
United States Patent, No. 8,645,955, February 4, 2014 (Application on June 12, 2007).
J.-K. Park, K. S. Yim, W.-g. Kim, J.-j. Yoo, K.-h. Kang, C.-s. Im, J.-d. Lee,
"Method, medium and apparatus storing and restoring register context for fast context switching between tasks,"
United States Patent, No. 8,635,627, January 21, 2014 (Application on December 12, 2006).
K. S. Yim,
From Experiment To Design – Fault Characterization and Detection in Parallel Computer Systems Using Computational Accelerators,
Ph.D. Dissertation, University of Illinois at Urbana-Champaign, May 2013.
e-Library
K. S. Yim, Z. Kalbarczyk, and R. K. Iyer,
"Pluggable Watchdog: Transparent Failure Detection for MPI Programs,"
in Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 489-500, May 2013. (Acceptance Ratio: 21.8% = 108/494)
Paper Slide
This paper presents a framework and its techniques that can detect various types of runtime errors and failures in MPI programs. The presented framework offloads its detection techniques to an external device (e.g., extension card). By developing intelligence on the normal behavioral and semantic execution patterns of monitored parallel threads, the presented external error detectors can accurately and quickly detect errors and failures. This architecture allows us to use powerful detectors without directly using the computing power of the monitored system. The separation of hardware of the monitored and monitoring systems offers an extra advantage in terms of system reliability. We have prototyped our system on a parallel computer system by using an FPGA-based PCI extension card as a monitoring device. We have conducted a fault injection experiment to evaluate the presented techniques using eight MPI-based parallel programs. The techniques cover ~98.5% of faults, on average. The average performance overhead is 1.8% for techniques that detect crash and hang failures and 6.6% for techniques that detect SDC failures.
K. S. Yim, J. C. Son, B. Y. Chung,
"Storage device reducing a memory management load and computing system using the storage device,"
United States Patent, No. 8,443,144, May 14, 2013 (Application on July 21, 2008).
H.-c. Kim, K. S. Yim, S.-W. Lee, J.-j. Yoo, J.-d. Lee, Y.-s. Shin,
"Apparatus and method of detecting errors in embedded software,"
United States Patent, No. 8,589,889, November 19, 2013 (Application on November 13, 2007).
K. S. Yim, V. Sieda, Z. Kalbarczyk, D. Chen, and R. K. Iyer,
"A Fault-Tolerant Programmable Voter for Software-Based N-Modular Redundancy,"
in Proceedings of the IEEE Aerospace Conference, 20 pages, March 2012.
Paper Slide
This paper presents a fault-tolerant, programmable voter architecture for software-implemented N-tuple modular redundant (NMR) computer systems. Software NMR is a cost-efficient solution for high-performance, mission-critical computer systems because it can be built on top of commercial off-the-shelf (COTS) devices. Due to the large volume and randomness of voting data, a software NMR system requires a programmable voter. Our experiment shows that voting software that executes on a processor has time-of-check-to-time-of-use (TOCTTOU) vulnerabilities and is unable to tolerate long duration faults. In order to address these two problems, we present a special-purpose voter processor and its embedded software architecture. The processor has a set of new instructions and hardware modules that are used by the software in order to accelerate the voting software execution and address the two identified reliability problems. We have implemented the presented system on an FPGA platform. Our evaluation result shows that using the presented system reduces the execution time of error detection codes (commonly used in voting software) by 14% and their code size by 56%. Our fault injection experiments validate that the presented system removes the TOCTTOU vulnerabilities and recovers under both transient and long duration faults. This is achieved by using 0.7% extra hardware in a baseline processor.
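The basic voting step in software NMR can be sketched as a strict majority vote over replica outputs. The Python below is a generic illustration with a hypothetical function name; it is exactly the kind of plain software voter the abstract describes as exposed to TOCTTOU and long duration faults, and it does not model the paper's voter processor or its new instructions.

```python
from collections import Counter


def majority_vote(replica_outputs):
    """Return the value produced by a strict majority of replicas.

    Raises RuntimeError when the input is empty or no value has a majority.
    """
    if not replica_outputs:
        raise RuntimeError("no replica outputs to vote on")
    value, count = Counter(replica_outputs).most_common(1)[0]
    if count * 2 <= len(replica_outputs):
        raise RuntimeError("no majority among replica outputs")
    return value


# Triple modular redundancy (N = 3) with one faulty replica output.
print(majority_vote([42, 42, 7]))
```

With N = 3, one corrupted replica is outvoted; if all three disagree, the voter can only report the failure rather than mask it.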
K. S. Yim, C. H. Lee,
"Memory device and management method of memory device,"
United States Patent, No. 8,321,624, November 27, 2012 (Application on May 27, 2009).
K. S. Yim, G. S. Choi,
"Storage management method and system using the same,"
United States Patent, No. 8,171,239, May 1, 2012 (Application on March 20, 2008).
Y.-S. Shin, S.-w. Lee, H.-C. Kim, J.-j. Yoo, J.-d. Lee, M.-k. Jeong, and K. S. Yim,
"Transmitting and receiving method and apparatus in real-time system,"
United States Patent, No. 8,194,658, June 5, 2012 (Application on November 7, 2007).
K. S. Yim, J.-k. Park, J.-j. Yoo, J.-d. Lee, C.-s. Im, Y.-S. Shin,
"Kernel-Aware Debugging System, Medium, and Method,"
United States Patent, No. 8,239,838, August 7, 2012 (Application on May 7, 2007); EU Patents, No. EP1870810A2, No. EP1870810A3.
J. D. Lee, K. S. Yim, W. G. Kim, J. J. Yoo, J. K. Park, C.-W. Baek, C. S. Im,
"Method and System for Providing Context Switch using Multiple Register File,"
United States Patent, No. 8,327,122, December 4, 2012 (Application on March 2, 2007).
K. S. Yim, J. J. Yoo, J. K. Park, C. S. Im, J. D. Lee, W. G. Kim, S. H. Choi,
"Method, System, and Medium for Providing Interprocessor Data Communication,"
United States Patent, No. 8,127,110, February 28, 2012 (Application on January 17, 2007).
K. S. Yim, C. Pham, M. Saleheen, Z. T. Kalbarczyk, and R. K. Iyer,
"Hauberk: Lightweight Silent Data Corruption Error Detector for GPGPU,"
in Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 287-300, May 2011. (Acceptance Ratio: 19.6% = 112/571)
Paper Slide
The high performance and relatively low cost of GPU-based platforms provide an attractive alternative for general purpose high performance computing (HPC). However, the emerging HPC applications usually have stricter output correctness requirements than typical GPU applications (i.e., 3D graphics). This paper first analyzes the error resiliency of GPGPU platforms using a fault injection tool we have developed for commodity GPU devices. On average, 16-33% of injected faults cause silent data corruption (SDC) errors in the HPC programs executing on GPU. This SDC ratio is significantly higher than that measured in CPU programs (<2.3%). In order to tolerate SDC errors, customized error detectors are strategically placed in the source code of target GPU programs so as to minimize performance impact and error propagation and maximize recoverability. The presented HAUBERK technique is deployed in seven HPC benchmark programs and evaluated using fault injection. The results show a high average error detection coverage (~87%) with a small performance overhead (~15%).
K. S. Yim, D. Hreczany, and R. K. Iyer,
"A Hybrid Testing Automation Framework to Leverage Local and Global Computing Resources,"
Lecture Notes in Computer Science (LNCS), 6784:479-494, June 2011. (Impact Factor: 0.97)
LNCS
In web application development, testing forms an increasingly large portion of software engineering costs due to the growing complexity and short time-to-market of these applications. This paper presents a hybrid testing automation framework (HTAF) that can automate routine work in testing and releasing web software. Using this framework, an individual software engineer can easily describe routine software engineering tasks and schedule them efficiently across both a local machine and global cloud computers. This framework is applied to commercial web software development processes. Our industry practice shows four example cases where the hybrid and decentralized architecture of HTAF helps to effectively manage both the hardware resources and the manpower required for testing and releasing web applications.
K. S. Yim, J.-j. Yoo, Y.-s. Shin, S.-w. Lee, H.-c. Kim, J.-d. Lee, M.-k. Jeong,
"Method of managing memory in multiprocessor system on chip,"
United States Patent, No. 7,996,630, August 9, 2011 (Application on August 11, 2010).
J. D. Lee, S.-w. Lee, J.-j. Yoo, Y.-s. Shin, M.-k. Jeong, and K. S. Yim,
"Method, medium and apparatus managing memory,"
United States Patent, No. 7,895,408, February 22, 2011 (Application on December 20, 2007).
K. S. Yim, Z. T. Kalbarczyk, and R. K. Iyer,
"Measurement-based Analysis of Fault and Error Sensitivities of Dynamic Memory,"
in Proceedings of the IEEE International Conference on Dependable Systems and Networks (DSN), pp. 431-436, June 2010. (Practical Experience Report)
Paper Slide
This paper presents a measurement-based analysis of the fault and error sensitivities of dynamic memory. We extend a software-implemented fault injector to support data-type-aware fault injection into dynamic memory. The results indicate that dynamic memory exhibits about 18 times higher fault sensitivity than static memory, mainly because of the higher activation rate. Furthermore, we show that errors in a large portion of static and dynamic memory space are recoverable by simple software techniques (e.g., reloading data from a disk). The recoverable data include pages filled with identical values (e.g., ‘0’) and pages loaded from files unmodified during the computation. Consequently, the selection of targets for protection should be based on knowledge of recoverability rather than on error sensitivity alone.
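The recoverability criterion described above can be illustrated with a minimal Python sketch. It is not the paper's tooling: the `Page` class and its `file_backed` and `dirty` flags are hypothetical names introduced only for illustration, and the check simply encodes the two recoverable cases the abstract names (pages filled with one identical value, and pages loaded from files that were not modified afterwards).

```python
from dataclasses import dataclass


@dataclass
class Page:
    data: bytes
    file_backed: bool = False  # hypothetical flag: page was loaded from a file
    dirty: bool = False        # hypothetical flag: page was modified after loading


def is_recoverable(page: Page) -> bool:
    """Return True if the page could be rebuilt by simple software means."""
    # A page filled with one identical value (e.g., all zeros) can be regenerated.
    if len(set(page.data)) == 1:
        return True
    # An unmodified file-backed page can be reloaded from disk.
    return page.file_backed and not page.dirty
```

Under these assumptions, a protection scheme would first skip pages for which `is_recoverable` is true and spend its budget on the remaining ones.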
K. S. Yim, J.-j. Yoo, Y.-s. Shin, S.-w. Lee, H.-c. Kim, J.-d. Lee, M.-k. Jeong,
"Method of managing memory in multiprocessor system on chip,"
United States Patent, No. 7,805,582, September 28, 2010 (Application on September 13, 2007).
K. S. Yim, J. W. Kim, S. J. Ryu, J. K. Park, J. J. Yoo, D.-H. Yoo, C. S. Im, J. D. Lee, H. S. Kim,
"Method, medium, and Apparatus with Interrupt Handling in a Reconfigurable Array,"
United States Patent, No. 7,836,291, November 16, 2010 (Application on October 17, 2006).
K. S. Yim, J.-j. Yoo, J.-k. Park,
"Data storage system with complex memory and method of operating the same,"
United States Patent, No. 7,689,761, March 30, 2010 (Application on July 13, 2006).
K. S. Yim, J. D. Lee, J.-J. Yoo, K.-h. Kang, J.-k. Park, C.-s. Im, W.-g. Kim, C.-w. Baek,
"Method for reducing code size of a program in code memory by dynamically storing an instruction into a memory location following a group of instructions indicated by an offset operand and either a length operand or a bitmask operand of an echo instruction,"
United States Patent, No. 7,831,809, November 9, 2010 (Application on August 28, 2006).
K. S. Yim, Z. T. Kalbarczyk, and R. K. Iyer,
"Quantitative Analysis of Long Latency Failure in System Software,"
in Proceedings of the IEEE Pacific Rim International Symposium on Dependable Computing (PRDC), pp. 23-30, November 2009.
Paper Slide
This paper presents a study on long latency failures using accelerated fault injection. The data collected from the experiments are used to analyze the significance, causes, and characteristics of long latency failures caused by soft errors in the processor and the memory. The results indicate that a non-negligible portion of soft errors in the code and data memory lead to long latency failures. The long latency failures are caused by errors with long fault activation times and errors causing failures only under certain runtime conditions. On the other hand, less than 0.5% of soft errors in the processor registers used in kernel mode lead to a failure with latency longer than a thousand seconds. This is due to a strong temporal locality of the register values. The study also shows that the obtained insight can be used to guide the design and placement (in the application code and/or system) of application-specific error detectors.
K. S. Yim and J. C. Son,
"SynergyFS: A Stackable File System Creating Synergies Between Heterogeneous Storage Devices,"
in Proceedings of the Ottawa Linux Symposium (OLS), pp. 255-259, July 2008.
Paper
Hybrid storage architecture is one efficient method that can optimize the I/O performance, cost, and power consumption of storage systems. Thanks to the advances in semiconductor and optical storage technology, its application area is expanding. It tries to store data on the most suitable medium by considering the I/O locality of the data. Data management between heterogeneous storage media is important, but it has been done manually by system users. This paper presents an automatic management technique for a hybrid storage architecture. A novel software layer is defined in the kernel between virtual and physical file systems. The proposed layer is a variant of stackable file systems, but is able to move files between heterogeneous physical file systems. For example, by utilizing semantic information (e.g., file type and owner process), the proposed system optimizes the I/O performance without any manual control. Also, as the proposed system concatenates the storage space of physical file systems, its space overhead is negligible. Specific characteristics of the proposed system are analyzed through performance evaluation.
K. S. Yim, J. D. Lee, J. Park, C. Im, J.-J. Yoo, and Y. Ryu,
"A Software Reproduction of Virtual Memory for Deeply Embedded Systems,"
Lecture Notes in Computer Science (LNCS), Springer-Verlag, 3980:1000-1009, May 2006. (SCIE, Impact Factor: 1.21)
Paper Slide
Both the hardware cost and power consumption of computer systems heavily depend on the size of main memory, namely DRAM. This becomes especially important in tiny embedded systems (e.g., micro sensors) since they are produced at large scale and have to operate as long as possible, e.g., ten years. Although several methods have been developed to reduce the program code and data size, most of them need extra hardware devices, making them unsuitable for such tiny systems. For example, a virtual memory system needs both MMU and TLB devices to execute a large program on a small memory. This paper presents a software reproduction of the virtual memory system, focusing especially on the paging mechanism. In order to logically expand the physical memory space, the proposed method compacts, compresses, and swaps in/out heap memory blocks, which typically form over half of the whole memory size. A prototype implementation verifies that the proposed method can more than double the memory capacity. As a result, large programs run in parallel with a reasonable overhead, comparable to that of hardware-based VM systems.
L.-z. Han, Y. Ryu, and K. S. Yim,
"CATA: A Garbage Collection Scheme for Flash Memory File Systems,"
Lecture Notes in Computer Science (LNCS), 4159:103-112, September 2006. (SCIE, Impact Factor: 1.21)
LNCS
K. S. Yim, J.-J. Yoo, J. D. Lee, and J. Kim,
"Operating System Support for Procedural Abstraction in Embedded Systems,"
in Proceedings of the IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), pp. 378-382, August 2006. (Acceptance Ratio: 32%)
Paper Slide
Procedural abstraction reduces code size by replacing repeated code fragments with call instructions to a subroutine that executes the repeated fragment. However, in order to build a subroutine, extra instructions are necessary to support the procedure call mechanism. In this paper, we present an operating system level technique which improves the space efficiency of a procedural abstraction-based code compaction technique. The call-related extra instructions are not used in the proposed technique because operating system routines implicitly support the procedure call and return. The proposed technique consists of three execution modes, including one applicable to ROM-based systems. The experimental results show that the proposed technique reduces the code size significantly while increasing the execution time slightly.
K. S. Yim,
"A Novel Memory Hierarchy for Flash Memory Based Storage Systems,"
Journal of Semiconductor Technology and Science (JSTS), ISSN 1598-1657, 5(4):69-76, December 2005. (Impact Factor: 0.13)
Paper
Semiconductor scientists and engineers ideally desire faster yet cheaper non-volatile memory devices. In practice, no single device satisfies this desire because a faster device is expensive and a cheaper one is slow. Therefore, in this paper, we use heterogeneous non-volatile memories and construct an efficient hierarchy for them. First, a small RAM device (e.g., MRAM, FRAM, and PRAM) is used as a write buffer for flash memory devices. Since the buffer is faster and does not have an erase operation, writes can be done quickly in the buffer, making the write latency short. Also, if a write is requested to data already stored in the buffer, the write is processed directly in the buffer, saving one write operation to flash storage. Second, we use several types of flash memories (e.g., SLC and MLC flash memories) in order to reduce the overall storage cost. Specifically, write requests are classified into two types, hot and cold, where hot data is likely to be modified in the near future. Only hot data is stored in the faster SLC flash, while cold data is kept in slower MLC flash or NOR flash. The evaluation results show that the proposed hierarchy is effective at improving the access time of flash memory storage in a cost-effective manner thanks to the locality in memory accesses.
K. S. Yim, J. Kim, and K. Koh,
"A Fast Start-Up Technique for Flash Memory Based Computing Systems,"
in Proceedings of the ACM Symposium on Applied Computing (SAC), pp. 852-858, March 2005. (Acceptance Ratio: 36.39% = 278/764)
Paper
Flash memory based embedded computing systems are becoming increasingly prevalent. These systems typically have to provide an instant start-up time. However, we observe that mounting a file system for flash memory takes 1 to 25 seconds, depending mainly on the flash capacity. Since flash chip capacity doubles every year, this mounting time will soon become the dominant cause of system start-up delay. Therefore, in this paper, we present instant mounting techniques for flash file systems that store the in-memory file system metadata to flash memory when unmounting the file system and reload the stored metadata quickly when mounting the file system. These metadata snapshot techniques are specifically developed for NOR- and NAND-type flash memories while overcoming their physical constraints. The proposed techniques check the validity of the stored snapshot and use the proposed fast crash recovery techniques when the snapshot is invalid. Based on the experimental results, the proposed techniques can reduce the flash mounting time by about two orders of magnitude over the existing de facto standard flash file system.
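The snapshot-then-validate mount flow described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not say how snapshot validity is checked, so the SHA-256 checksum here is an assumed stand-in, and `save_snapshot`, `mount`, and the `full_scan` callback (standing in for the paper's crash recovery path) are hypothetical names.

```python
import hashlib
import json


def save_snapshot(metadata: dict) -> bytes:
    """At unmount: serialize in-memory metadata and prepend a checksum (assumed scheme)."""
    body = json.dumps(metadata, sort_keys=True).encode()
    return hashlib.sha256(body).digest() + body


def mount(snapshot, full_scan):
    """At mount: reuse the snapshot if it validates, otherwise fall back to recovery."""
    if snapshot is not None and len(snapshot) >= 32:
        digest, body = snapshot[:32], snapshot[32:]
        if hashlib.sha256(body).digest() == digest:
            return json.loads(body), "snapshot"
    # Invalid or missing snapshot: rebuild the metadata the slow way.
    return full_scan(), "scan"
```

For example, `mount(save_snapshot(meta), scan_fn)` returns the saved metadata without calling `scan_fn`, while a truncated or corrupted snapshot triggers the fallback.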
K. S. Yim,
"Studies on compressed memory architectures and decentralized communication systems for ubiquitous computing,"
Master’s Thesis, Seoul National U., February 2005.
K. S. Yim, H. Bahn, and K. Koh,
"A Flash Compression Layer for SmartMedia Card Systems,"
IEEE Transactions on Consumer Electronics (TCE), 50(1):192-197, February 2004. (Impact Factor: 0.73)
Paper Slide
The flash memory based SmartMedia Card is becoming increasingly popular as data storage for mobile consumer electronics. Since flash memory is an order of magnitude more expensive than magnetic disks, data compression can be effectively used in managing flash memory based storage systems. However, compressed data management in flash memory is challenging because it only supports page-based I/Os. For example, when the size of compressed data is smaller than the page size, internal fragmentation occurs and this seriously degrades the effectiveness of compression. In this paper, we developed a flash compression layer (FCL) for SmartMedia Card systems. FCL stores several small compressed pages in one physical page by using a write buffer. Based on a prototype implementation and simulation studies, we show that the proposed system expands the effective storage capacity of flash memory to more than 140% of its original size and expands the write bandwidth significantly.
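The idea of packing several compressed pages into one physical page through a write buffer can be sketched in Python as follows. This is a simplified illustration, not the FCL design itself: the 512-byte page size, the use of `zlib`, the raw-storage fallback for incompressible pages, and the names `pack_pages` and `unpack` are all assumptions, and per-entry header overhead is ignored.

```python
import zlib

PAGE_SIZE = 512  # assumed physical page size in bytes


def pack_pages(logical_pages, page_size=PAGE_SIZE):
    """Pack compressed logical pages into as few physical pages as possible."""
    physical = []         # flushed physical pages, each a list of (is_compressed, blob)
    buffer, used = [], 0  # write buffer for the physical page currently being filled
    for page in logical_pages:
        blob = zlib.compress(page)
        # Store the page raw when compression does not shrink it.
        entry = (True, blob) if len(blob) < len(page) else (False, page)
        size = len(entry[1])
        if size > page_size:
            raise ValueError("logical page larger than a physical page")
        if used + size > page_size:  # buffer is full: flush it as one physical page
            physical.append(buffer)
            buffer, used = [], 0
        buffer.append(entry)
        used += size
    if buffer:
        physical.append(buffer)
    return physical


def unpack(physical):
    """Recover the logical pages in their original order."""
    return [zlib.decompress(blob) if compressed else blob
            for page in physical for compressed, blob in page]
```

Because small compressed entries share a physical page instead of each occupying its own, the internal fragmentation mentioned in the abstract is avoided in this sketch.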
K. S. Yim, J. Kim, and K. Koh,
"An Energy-Efficient Reliable Transport for Wireless Sensor Networks,"
Lecture Notes in Computer Science (LNCS), Springer-Verlag, 3090:54-64, February 2004. (SCIE, Impact Factor: 1.45)
Paper Slide
In a wireless sensor network, sensor devices are connected by unreliable radio channels. Thus, reliable packet delivery is an important design challenge. The existing sensor-to-base reliable transport mechanism, however, depends on a centralized manager node, incurring large control overheads for synchronizing reporting frequencies. In this paper, we present a decentralized reliable transport (DRT) with two novel decentralized reliability control schemes. First, we propose an independent reporting scheme where each sensor node stochastically makes reporting decisions. Second, we describe a cooperative reporting scheme where every sensor node implicitly cooperates with its neighbors for uniform reporting. In the reporting step, DRT uses a reliable MAC channel, which is specifically optimized for reducing the energy dissipation. Experimental results show that DRT satisfies the desired delivery rate reliably in a decentralized manner while significantly reducing the energy consumption of the radio device and the communication time.
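The independent reporting scheme, in which each node decides on its own whether to report, can be illustrated with a short Python sketch. The abstract does not state how the reporting probability is chosen, so this sketch assumes each node uses the ratio of the desired number of reports to the number of nodes; `independent_reports` and its parameters are hypothetical names.

```python
import random


def independent_reports(num_nodes, target_reports, rng):
    """Return the ids of nodes that decide to report in one round."""
    if num_nodes <= 0:
        return []
    # Assumed rule: every node reports with the same probability, chosen so
    # that the expected number of reports equals the target.
    p = min(1.0, target_reports / num_nodes)
    # Each node draws independently; no central manager coordinates the choice.
    return [node for node in range(num_nodes) if rng.random() < p]
```

With 1000 nodes and a target of 100 reports, each round yields about 100 reporters on average without any control traffic between the nodes and the base station.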
K. S. Yim, J. Kim, and K. Koh,
"An Energy-Efficient Routing and Reporting Scheme to Exploit Data Similarities in Wireless Sensor Networks,"
Lecture Notes in Computer Science (LNCS), Springer-Verlag, 3207:515-527, August 2004. (SCIE, Impact Factor: 1.45)
Paper Slide
Wireless sensor networks are based on a large number of tiny sensor nodes, which collect various types of physical data. These sensors are typically energy-limited and low-power operation is an important design constraint. In this paper, we propose a novel routing and reporting scheme based on sample data similarities commonly observed in sensed data. Based on reliable transport protocols, the proposed scheme takes advantage of the spatial and temporal similarities of the sensed data, reducing both the number of sensor nodes that are asked to report data and the frequency of those reports. Experimental results show that the proposed scheme can significantly reduce the communication energy consumption of a wireless sensor network while incurring only a small degradation in sensing accuracy.
K. S. Yim, J.-S. Lee, J. Kim, S.-D. Kim, and K. Koh,
"A Space-Efficient On-Chip Compressed Cache Organization for High Performance Computing,"
Lecture Notes in Computer Science (LNCS), Springer-Verlag, 3358:952-964, December 2004. (SCIE, Impact Factor: 1.45)
Paper Slide
In order to alleviate the ever-increasing processor-memory performance gap of high-end parallel computers, on-chip compressed caches have been developed that can reduce the cache miss count and off-chip memory traffic by storing and transferring cache lines in a compressed form. However, we observed that their performance gain is often limited due to their use of the coarse-grained compressed cache line management which incurs internally fragmented space. In this paper, we present the fine-grained compressed cache line management which addresses the fragmentation problem, while avoiding an increase in the metadata size such as tag field and VM page table. Based on the SimpleScalar simulator with the SPEC benchmark suite, we show that over an existing compressed cache system the proposed cache organization can reduce the memory traffic by 15%, as it delivers compressed cache lines in a fine-grained way, and the cache miss count by 23%, as it stores up to three compressed cache lines in a physical cache line.
K. S. Yim, H. Cha, and K. Koh,
"NIC-NET: A Host-Independent Network Solution for High-End Network Servers,"
Lecture Notes in Computer Science (LNCS), Springer-Verlag, 3320:401-405, December 2004. (SCIE, Impact Factor: 1.45)
Paper Slide
This paper discusses a host-independent network system where a network interface card is utilized in an efficient way. By eliminating protocol stack processing overheads from the host system, the proposed system improves the communication speed by 11-36% under heavy network and CPU loads.
K. S. Yim, S. Lee, and K. Koh,
"An Energy-Efficient Compression Algorithm for Wireless Communication,"
in Proceedings of the International SoC Design Conference (ISOCC), pp. 578-581, November 2004.
K. S. Yim, J. Kim, and K. Koh,
"Performance Analysis of On-Chip Cache and Main Memory Compression Systems for High-End Parallel Computers,"
in Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), pp. 469-475, June 2003.
Paper
Cache and memory compression systems have been developed for improving the memory system performance of high-performance parallel computers. Cache compression systems can reduce the on-chip cache miss rate and off-chip memory traffic by storing and transferring cache lines in compressed form, while memory compression systems can expand main memory capacity by storing memory pages in compressed form. However, these systems have not been quantitatively evaluated under identical conditions, making it difficult to understand the performance of a new system relative to the existing systems. In this paper, we provide an identical execution-driven simulation environment for these systems. To the best of our knowledge, no prior work has evaluated the performance of cache and memory compression systems by using an execution-driven simulator. Experimental results show that cache compression systems reduce the cache miss rate by 16% and memory traffic by 30%, while they expand memory capacity by less than 160%. The results also show that memory compression systems significantly expand memory capacity by over 270%. Based on these experimental analyses, we finally provide future research directions on the compression systems.
K. S. Yim,
"Utilizing Network Interface Card for Host-Independent Network Systems,"
in Proceedings of the Samsung Human-tech Thesis Prize (Undergrad Division), February 2002.
K. S. Yim,
"The Perfect Analysis of Linux Kernel for Implementing General-Purpose Operating System,"
distributed to the open source communities, August 2001.
K. S. Yim,
"Design and Implementation of a General-Purpose Remote Control System,"
in Proceedings of the Samsung Human-tech Thesis Prize (Undergrad Division), February 2001.
