Contents

1. Active network alignment: A matching-based approach

4. Discriminative word alignment with conditional random fields

5. Multiple Sequence Alignment Using Tabu Search

6. Temporal alignment using the incremental unit framework

7. Text alignment in the real world

9. Alignment by Maximization of Mutual Information

10. An anomaly detection technique for business processes based on extended dynamic Bayesian networks

11. Complex mapping discovery for semantic process model alignment

12. Data-aware process mining: Discovering decisions in processes using alignments

14. Discriminative word alignment via alignment matrix modeling

15. Measuring the alignment between business processes and software systems: A case study

16. Pairwise sequence alignment algorithms - A survey

17. Process mining for clinical processes: A comparative analysis of four Australian hospitals

19. Semantic E-workflow composition

20. Triangular Alignment (TAME): A Tensor-Based Approach for Higher-Order Network Alignment

21. Using normalized alignment scores to detect incorrectly aligned segments


Abstracts

[1] Active network alignment: A matching-based approach (2017)

(Malmi, Eric and Gionis, Aristides and Terzi, Evimaria | )

Abstract: Network alignment is the problem of matching the nodes of two graphs, maximizing the similarity of the matched nodes and the edges between them. This problem is encountered in a wide array of applications (from biological networks to social networks to ontologies) where multiple networked data sources need to be integrated. Due to the difficulty of the task, an accurate alignment can rarely be found without human assistance. Thus, it is of great practical importance to develop network alignment algorithms that can optimally leverage experts who are able to provide the correct alignment for a small number of nodes. Yet, only a handful of existing works address this active network alignment setting. The majority of the existing active methods focus on absolute queries ("are nodes a and b the same or not?"), whereas we argue that it is generally easier for a human expert to answer relative queries ("which node in the set {b1, ..., bn} is the most similar to node a?"). This paper introduces two novel relative-query strategies, TopMatchings and GibbsMatchings, which can be applied on top of any network alignment method that constructs and solves a bipartite matching problem. Our methods identify the most informative nodes to query by sampling the matchings of the bipartite graph associated with the network-alignment instance. We compare the proposed approaches to several commonly used query strategies and perform experiments on both synthetic and real-world datasets. Our sampling-based strategies yield the highest overall performance, outperforming all the baseline methods by more than 15 percentage points in some cases. In terms of accuracy, TopMatchings and GibbsMatchings perform comparably. GibbsMatchings is significantly more scalable, but it requires hyperparameter tuning for a temperature parameter.

DOI: 10.1145/3132847.3132983

[4] Discriminative word alignment with conditional random fields (2006)

(Blunsom, Phil and Cohn, Trevor | )

Abstract: In this paper we present a novel approach for inducing word alignments from sentence-aligned data. We use a Conditional Random Field (CRF), a discriminative model, which is estimated on a small supervised training set. The CRF is conditioned on both the source and target texts, and thus allows for the use of arbitrary and overlapping features over these data. Moreover, the CRF has efficient training and decoding processes which both find globally optimal solutions. We apply this alignment model to both French-English and Romanian-English language pairs. We show how a large number of highly predictive features can be easily incorporated into the CRF, and demonstrate that even with only a few hundred word-aligned training sentences, our model improves over the current state of the art with alignment error rates of 5.29 and 25.8 for the two tasks respectively. © 2006 Association for Computational Linguistics.

DOI: 10.3115/1220175.1220184

[5] Multiple Sequence Alignment Using Tabu Search (2004)

(Riaz, Tariq and Wang, Yi and Li, Kuo-bin | )

Abstract: Tabu search is a meta-heuristic approach that is found to be useful in solving combinatorial optimization problems. We implement the adaptive memory features of tabu search to align multiple sequences. Adaptive memory helps the search process to avoid local optima and explores the solution space economically and effectively without getting trapped in cycles. The algorithm is further enhanced by introducing extended tabu search features such as intensification and diversification. It intensifies by bringing the search process to poorly aligned regions of an elite solution, and softly diversifies by moving from one poorly aligned region to another. The neighborhoods of a solution are generated stochastically and a consistency-based objective function is employed to measure its quality. The algorithm is tested with the datasets from the BAliBASE benchmarking database. We have observed through experiments that for datasets comprising orphan sequences, divergent families and long internal insertions, tabu search generates better alignments than the other methods studied in this paper. The source code of our tabu search algorithm is available at http://www.bii.a-star.edu.sg/~tariq/tabu/.

DOI: 10.5555/976520.976550

[6] Temporal alignment using the incremental unit framework (2017)

(Kennington, Casey and Han, Ting and Schlangen, David | )

Abstract: We propose a method for temporal alignment (a precondition of meaningful fusion) of multimodal systems, using the incremental unit dialogue system framework, which gives the system flexibility in how it handles alignment: either by delaying a modality for a specified amount of time, or by revoking (i.e., backtracking) processed information so multiple information sources can be processed jointly. We evaluate our approach in an offline experiment with multimodal data and find that using the incremental framework is flexible and shows promise as a solution to the problem of temporal alignment in multimodal systems.

DOI: 10.1145/3136755.3136769

[7] Text alignment in the real world (1995)

(Davis, Mark W. and Dunning, Ted E. and Ogden, William C. | )

Abstract: Alignment methods based on byte-length comparisons of alignment blocks have been remarkably successful for aligning good translations from legislative transcriptions. For noisy translations in which the parallel text of a document has significant structural differences, byte-alignment methods often do not perform well. The Pan American Health Organization (PAHO) corpus is a series of articles that were first translated by machine methods and then improved by professional translators. Many of the Spanish PAHO texts do not share formatting conventions with the corresponding English documents, refer to tables in stylistically different ways and contain extraneous information. A method based on a dynamic programming framework, but using a decision criterion derived from a combination of byte-length ratio measures, hard matching of numbers, string comparisons and n-gram co-occurrence matching substantially improves the performance of the alignment process.

DOI: 10.3115/976973.976984

[9] Alignment by Maximization of Mutual Information (1997)

(Viola, Paul and Wells, William M. | International Journal of Computer Vision)

Abstract: A new information-theoretic approach is presented for finding the pose of an object in an image. The technique does not require information about the surface properties of the object, besides its shape, and is robust with respect to variations of illumination. In our derivation few assumptions are made about the nature of the imaging process. As a result the algorithms are quite general and may foreseeably be used in a wide variety of imaging situations. Experiments are presented that demonstrate the approach registering magnetic resonance (MR) images, aligning a complex 3D object model to real scenes including clutter and occlusion, tracking a human head in a video sequence and aligning a view-based 2D object model to real images. The method is based on a formulation of the mutual information between the model and the image. As applied here the technique is intensity-based, rather than feature-based. It works well in domains where edge or gradient-magnitude based methods have difficulty, yet it is more robust than traditional correlation. Additionally, it has an efficient implementation that is based on stochastic approximation.

DOI: 10.1023/A:1007958904918
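
As a rough illustration of the quantity this abstract is built on, the sketch below estimates the mutual information between two equally sized lists of intensities from a joint histogram. The histogram estimator, the function name `mutual_information`, and the `bins` and `max_value` parameters are assumptions made for illustration; the paper's own stochastic-approximation implementation is not reproduced here.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, max_value=256):
    """Estimate I(A; B) in bits from paired intensity samples.

    Illustrative joint-histogram estimator (an assumption, not the
    paper's stochastic-approximation method).
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    width = max_value / bins
    # Quantize each intensity into one of `bins` buckets.
    qa = [min(int(x / width), bins - 1) for x in a]
    qb = [min(int(y / width), bins - 1) for y in b]
    n = len(a)
    joint = Counter(zip(qa, qb))  # counts of co-occurring bucket pairs
    pa = Counter(qa)              # marginal counts for the first signal
    pb = Counter(qb)              # marginal counts for the second signal
    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        mi += p_ij * math.log2(p_ij / ((pa[i] / n) * (pb[j] / n)))
    return mi


# An intensity-inverted copy still shares all of its information.
model = [0, 0, 255, 255]
inverted = [255, 255, 0, 0]
print(mutual_information(model, inverted))
```

In this toy case the inverted signal is perfectly anti-correlated with the original yet still scores the maximum mutual information for a two-level signal, which hints at why an information-based criterion can tolerate intensity changes that would confuse a plain correlation measure.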

[10] An anomaly detection technique for business processes based on extended dynamic Bayesian networks (2019)

(Pauwels, Stephen | Proceedings of the ACM Symposium on Applied Computing)

Abstract: Checking and analyzing various executions of different Business Processes can be a tedious task as the logs from these executions may contain lots of events, each with a (possibly large) number of attributes. We developed a way to automatically model the behavior captured in log files with dozens of attributes. The advantage of our method is that we do not need any prior knowledge about the data and the attributes. The learned model can then be used to detect anomalous executions in the data. To achieve this we extend the existing Dynamic Bayesian Networks with other (existing) techniques to better model the normal behavior found in log files. We introduce a new algorithm that is able to learn a model of a log file starting from the data itself. The model is capable of scoring events and cases, even when new values or new combinations of values appear in the log file, and has the ability to give a decomposition of the given score, indicating the root cause for the anomalies. Furthermore we show that our model can be used in a more general way for detecting Concept Drift.

DOI: 10.1145/3297280.3297326

[11] Complex mapping discovery for semantic process model alignment (2010)

(Gater, Ahmed and Grigori, Daniela and Bouzeghoub, Mokrane | iiWAS2010 - 12th International Conference on Information Integration and Web-Based Applications and Services)

Abstract: With the growing importance of processes in current information systems and service-oriented architectures, there is an increasing need for automatic techniques that can compare process models. Examples of such applications are numerous: delta analysis, version management, compatibility and replaceability analysis of business protocols, behavior-based service discovery. When comparing two process models, first a mapping between their activities has to be found, identifying activities that are either equal or similar. Finding such a mapping is quite complex and should take into account activity attributes (name, inputs/outputs), process structure and granularity differences that may exist in decomposing a given functionality. This paper presents an approach for automatic detection of correspondences between the activities of two semantically annotated process models (the annotations concern inputs/outputs of composing activities) that addresses these challenges. Specifically, the proposed technique is able to identify complex mappings (1-n) between activities and also proposes a process alignment (mapping between all activities) that takes into account process structure. Copyright 2010 ACM.

DOI: 10.1145/1967486.1967537

[12] Data-aware process mining: Discovering decisions in processes using alignments (2013)

(De Leoni | Proceedings of the ACM Symposium on Applied Computing)

Abstract: Process discovery, i.e., learning process models from event logs, has attracted the attention of researchers and practitioners. Today, there exists a wide variety of process mining techniques that are able to discover the control-flow of a process based on event data. These techniques are able to identify decision points, but do not analyze data flow to find rules explaining why individual cases take a particular path. Fortunately, recent advances in conformance checking can be used to align an event log with data and a process model with decision points. These alignments can be used to generate a well-defined classification problem per decision point. This way data flow and guards can be discovered and added to the process model. Copyright 2013 ACM.

DOI: 10.1145/2480362.2480633

[14] Discriminative word alignment via alignment matrix modeling (2008)

(Niehues, Jan and Vogel, Stephan | )

Abstract: In this paper a new discriminative word alignment method is presented. This approach models directly the alignment matrix by a conditional random field (CRF) and so no restrictions to the alignments have to be made. Furthermore, it is easy to add features and so all available information can be used. Since the structure of the CRFs can get complex, the inference can only be done approximately and the standard algorithms had to be adapted. In addition, different methods to train the model have been developed. Using this approach the alignment quality could be improved by up to 23 percent for 3 different language pairs compared to a combination of both IBM4 alignments. Furthermore the word alignment was used to generate new phrase tables. These could improve the translation quality significantly.

DOI: 10.3115/1626394.1626397

[15] Measuring the alignment between business processes and software systems: A case study (2010)

(Aversano, Lerina and Grasso, Carmine and Tortorella, Maria | Proceedings of the ACM Symposium on Applied Computing)

Abstract: The alignment degree existing between a business process and the supporting software systems strongly affects the performance of the business process execution. Methods are needed for detecting this kind of alignment and keeping a business process aligned with a supporting software system even when one of the two evolves. Actually, any modification performed in the business process activities and/or supporting software systems may impact the process activities and/or software components, in terms of input/output and/or purpose and, therefore, cause misalignment. This paper proposes a framework including a set of metrics codifying the alignment concept with the aim of measuring it and detecting misalignment if it occurs. The application of the framework is explored through a case study. © 2010 ACM.

DOI: 10.1145/1774088.1774570

[16] Pairwise sequence alignment algorithms - A survey (2009)

(Haque, Waqar and Aravind, Alex and Reddy, Bharath | Proceedings of the 2009 Conference on Information Science, Technology and Applications, ISTA 09)

Abstract: Pairwise sequence alignment is a fundamental compute-intensive problem in bioinformatics that has helped researchers analyse biological sequences. The analysis has helped biologists detect pathogens, develop drugs, and identify common genes. The biological sequence database has been growing rapidly due to new sequences being discovered. This has brought many new challenges including sequence database searching and aligning long sequences. To solve these problems, many sequence alignment algorithms have been developed. These algorithms employ various techniques to efficiently find optimal or nearly-optimal alignments. In this paper, we present the popular past and recent work on both local and global pairwise sequence alignment algorithms. In addition to identifying the techniques used, the advantages and limitations of the algorithms are also presented. Copyright 2009 ACM.

DOI: 10.1145/1551950.1551980
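
For readers unfamiliar with the global alignment algorithms this survey covers, here is a minimal sketch of the classic Needleman-Wunsch dynamic program, returning only the optimal score. The scoring values (`match=1`, `mismatch=-1`, `gap=-2`) and the function name are assumptions chosen for illustration and are not taken from the survey.

```python
def global_alignment_score(s, t, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring parameters are illustrative assumptions.
    """
    rows, cols = len(s) + 1, len(t) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if s[i - 1] == t[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align s[i-1] with t[j-1]
                score[i - 1][j] + gap,               # gap in t
                score[i][j - 1] + gap,               # gap in s
            )
    return score[-1][-1]


print(global_alignment_score("GATTACA", "GATTACA"))
```

The table has (len(s) + 1) x (len(t) + 1) cells, so time and memory grow with the product of the sequence lengths, which is the cost that motivates the faster or approximate techniques for long sequences mentioned in the abstract.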

[17] Process mining for clinical processes: A comparative analysis of four Australian hospitals (2015)

(Partington, Andrew and Wynn, Moe and Suriadi, Suriadi and Ouyang, Chun and Karnon, Jonathan | ACM Transactions on Management Information Systems)

Abstract: Business process analysis and process mining, particularly within the health care domain, remain underutilized. Applied research that employs such techniques to routinely collected health care data enables stakeholders to empirically investigate care as it is delivered by different health providers. However, cross-organizational mining and the comparative analysis of processes present a set of unique challenges in terms of ensuring population and activity comparability, visualizing the mined models, and interpreting the results. Without addressing these issues, health providers will find it difficult to use process mining insights, and the potential benefits of evidence-based process improvement within health will remain unrealized. In this article, we present a brief introduction on the nature of health care processes, a review of process mining in health literature, and a case study conducted to explore and learn how health care data and cross-organizational comparisons with process-mining techniques may be approached. The case study applies process-mining techniques to administrative and clinical data for patients who present with chest pain symptoms at one of four public hospitals in South Australia. We demonstrate an approach that provides detailed insights into clinical (quality of patient health) and fiscal (hospital budget) pressures in the delivery of health care. We conclude by discussing the key lessons learned from our experience in conducting business process analysis and process mining based on the data from four different hospitals.

DOI: 10.1145/2629446

[19] Semantic E-workflow composition (2003)

(Cardoso, Jorge and Sheth, Amit | Journal of Intelligent Information Systems)

Abstract: Systems and infrastructures are currently being developed to support Web services. The main idea is to encapsulate an organization's functionality within an appropriate interface and advertise it as Web services. While in some cases Web services may be utilized in an isolated form, it is normal to expect Web services to be integrated as part of workflow processes. The composition of workflow processes that model e-service applications differs from the design of traditional workflows, in terms of the number of tasks (Web services) available to the composition process, in their heterogeneity, and in their autonomy. Therefore, two problems need to be solved: how to efficiently discover Web services, based on functional and operational requirements, and how to facilitate the interoperability of heterogeneous Web services. In this paper, we present a solution within the context of the emerging Semantic Web that includes use of ontologies to overcome some of the problems. We describe a prototype that has been implemented to illustrate how discovery and interoperability functions are achieved more efficiently.

DOI: 10.1023/A:1025542915514

[20] Triangular Alignment (TAME): A Tensor-Based Approach for Higher-Order Network Alignment (2017)

(Mohammadi, Shahin and Gleich, David F. and Kolda, Tamara G. and Grama, Ananth | IEEE/ACM Transactions on Computational Biology and Bioinformatics)

Abstract: Network alignment has extensive applications in comparative interactomics. Traditional approaches aim to simultaneously maximize the number of conserved edges and the underlying similarity of aligned entities. We propose a novel formulation of the network alignment problem that extends topological similarity to higher-order structures and provides a new objective function that maximizes the number of aligned substructures. This objective function corresponds to an integer programming problem, which is NP-hard. Consequently, we identify a closely related surrogate function whose maximization results in a tensor eigenvector problem. Based on this formulation, we present an algorithm called Triangular AlignMEnt (TAME), which attempts to maximize the number of aligned triangles across networks. Using a case study on the NAPAbench dataset, we show that triangular alignment is capable of producing mappings with high node correctness. We further evaluate our method by aligning yeast and human interactomes. Our results indicate that TAME outperforms the state-of-the-art alignment methods in terms of conserved triangles. In addition, we show that the number of conserved triangles is more significantly correlated, compared to the number of conserved edges, with node correctness and co-expression of edges. Our formulation and resulting algorithms can be easily extended to arbitrary motifs.

DOI: 10.1109/TCBB.2016.2595583
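
To make the "conserved triangles" objective concrete, the sketch below counts how many triangles of one graph are mapped onto triangles of another under a given node mapping. It only evaluates that count for a mapping supplied by the caller; it does not implement the tensor eigenvector method behind TAME. The function names and the edge-list input format are assumptions for illustration.

```python
from itertools import combinations


def triangles(edges):
    """Return the triangles of an undirected graph as frozensets of 3 nodes."""
    adjacency = {}
    for u, v in edges:
        if u != v:  # ignore self-loops
            adjacency.setdefault(u, set()).add(v)
            adjacency.setdefault(v, set()).add(u)
    found = set()
    for u, neighbours in adjacency.items():
        # Two neighbours of u that are also adjacent close a triangle.
        for v, w in combinations(neighbours, 2):
            if w in adjacency[v]:
                found.add(frozenset((u, v, w)))
    return found


def conserved_triangles(edges_g, edges_h, mapping):
    """Count triangles of G whose image under `mapping` is a triangle of H."""
    triangles_h = triangles(edges_h)
    count = 0
    for triangle in triangles(edges_g):
        if all(node in mapping for node in triangle):
            image = frozenset(mapping[node] for node in triangle)
            if len(image) == 3 and image in triangles_h:
                count += 1
    return count


# Hypothetical toy graphs: G has one triangle plus a pendant edge.
g = [(1, 2), (2, 3), (1, 3), (3, 4)]
h = [("a", "b"), ("b", "c"), ("a", "c")]
print(conserved_triangles(g, h, {1: "a", 2: "b", 3: "c", 4: "d"}))
```

Swapping the images of nodes 3 and 4 in the toy mapping breaks the only triangle, which shows how this count can separate mappings that a purely node-similarity score might rate alike.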

[21] Using normalized alignment scores to detect incorrectly aligned segments (2009)

(Turk, Andreas | International Conference on Information and Knowledge Management, Proceedings)

Abstract: This paper introduces a number of quality metrics which can be used to automatically detect incorrectly aligned segment pairs. This is an important issue in commercial machine translation as segmentation and alignment of bilingual corpora is often performed by third parties whose quality assurances cannot always be relied upon. The metrics in this paper are based on the normalized logarithm of the alignment score of a segment pair, where the alignment score is calculated using an IBM translation model 4. The alignment quality metrics are evaluated in classification experiments on a Chinese-English patent translation task and are shown to yield satisfactory performance. Copyright 2009 ACM.

DOI: 10.1145/1651343.1651349
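
The abstract does not say how the logarithm of the alignment score is normalized, so the following is only a sketch under stated assumptions: the score is treated as a probability in (0, 1], its log is divided by the combined token count of the segment pair, and pairs below a caller-chosen threshold are flagged. The function names, the normalization, the threshold, and the example scores are all hypothetical.

```python
import math


def normalized_log_score(alignment_score, source_len, target_len):
    """Length-normalized log of an alignment score in (0, 1]."""
    if not 0.0 < alignment_score <= 1.0:
        raise ValueError("alignment_score must be in (0, 1]")
    if source_len <= 0 or target_len <= 0:
        raise ValueError("segment lengths must be positive")
    # Assumption: divide by the combined token count so that long
    # segments are not penalized merely for being long.
    return math.log(alignment_score) / (source_len + target_len)


def flag_misaligned(pairs, threshold):
    """Return indices of (score, source_len, target_len) pairs below threshold."""
    return [
        index
        for index, (score, source_len, target_len) in enumerate(pairs)
        if normalized_log_score(score, source_len, target_len) < threshold
    ]


# Hypothetical scores for two five-token segment pairs.
pairs = [(1e-4, 5, 5), (1e-20, 5, 5)]
print(flag_misaligned(pairs, threshold=-2.0))
```

In practice the scores would come from a trained translation model and the threshold would be tuned on labelled data, as in the classification experiments the abstract mentions.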

