
Exploring C#: A Micro MapReduce

王朝学院 · author unknown · 2016-05-20

2015-05-22 01:06 by 蘑菇先生

MapReduce has been a popular programming model for distributed computing in recent years. This article uses C# to give a brief introduction to MapReduce-style distributed computation.

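Contents

Background
Map implementation
Reduce implementation
Distributed support
Summary

Background

In some parallel universe, a programmer named Xiao Zhang receives a task from his boss: count how often each word occurs in user feedback, in order to analyze users' main habits. The text is as follows: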

const string hamlet = @"Though yet of Hamlet our dear brother's death
The memory be green, and that it us befitted
To bear our hearts in grief and our whole kingdom
To be contracted in one brow of woe,
Yet so far hath discretion fought with nature
That we with wisest sorrow think on him,
Together with remembrance of ourselves.
Therefore our sometime sister, now our queen,
The imperial jointress to this warlike state,
Have we, as 'twere with a defeated joy,--
With an auspicious and a dropping eye,
With mirth in funeral and with dirge in marriage,
In equal scale weighing delight and dole,--
Taken to wife: nor have we herein barr'd
Your better wisdoms, which have freely gone
With this affair along. For all, our thanks.
Now follows, that you know, young Fortinbras,
Holding a weak supposal of our worth,
Or thinking by our late dear brother's death
Our state to be disjoint and out of frame,
Colleagued with the dream of his advantage,
He hath not fail'd to pester us with message,
Importing the surrender of those lands
Lost by his father, with all bonds of law,
To our most valiant brother. So much for him.
Now for ourself and for this time of meeting:
Thus much the business is: we have here writ
To Norway, uncle of young Fortinbras,--
Who, impotent and bed-rid, scarcely hears
Of this his nephew's purpose,--to suppress
His further gait herein; in that the levies,
The lists and full proportions, are all made
Out of his subject: and we here dispatch
You, good Cornelius, and you, Voltimand,
For bearers of this greeting to old Norway;
Giving to you no further personal power
To business with the king, more than the scope
Of these delated articles allow.
Farewell, and let your haste commend your duty.";

Being a top graduate of 蓝翔 (Lanxiang), Xiao Zhang had it implemented in no time:

var content = hamlet.Split(new[] { " ", Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
var wordcount = new Dictionary<string, int>();
foreach (var item in content)
{
    // Increment the count for a known word, otherwise start it at 1.
    if (wordcount.ContainsKey(item))
        wordcount[item] += 1;
    else
        wordcount.Add(item, 1);
}

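Being an ambitious young man, Xiao Zhang decided to abstract and encapsulate the algorithm and to make it support multi-node computation. He split the counting program into two major steps: decomposition and computation.

Step one: decompose and map the text, along some dimension (paragraph, word, or letter), into minimal independent units.
Step two: merge the repeated minimal units by computation.

With the MapReduce paper as a reference, Xiao Zhang designed Map and Reduce as follows: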
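Before the actual design, the two steps can be sketched in isolation. This is only a minimal illustration, not the article's implementation: the names `TwoStepSketch`, `SplitWords`, `SplitLetters`, and `Count` are hypothetical, and the sample sentence is made up. It shows that the merge step stays the same whichever dimension is chosen for the decomposition.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TwoStepSketch
{
    // Step one (hypothetical helper): decompose the text into words.
    public static IEnumerable<string> SplitWords(string text)
    {
        return text.Split(new[] { ' ', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Step one, different dimension: decompose the text into letters.
    public static IEnumerable<char> SplitLetters(string text)
    {
        return text.Where(c => char.IsLetter(c));
    }

    // Step two: merge repeated units by summing their occurrences.
    public static Dictionary<T, int> Count<T>(IEnumerable<T> units)
    {
        var counts = new Dictionary<T, int>();
        foreach (var unit in units)
        {
            int current;
            counts.TryGetValue(unit, out current); // current stays 0 when the key is absent
            counts[unit] = current + 1;
        }
        return counts;
    }

    public static void Main()
    {
        const string text = "to be or not to be";

        foreach (var pair in Count(SplitWords(text)))
            Console.WriteLine(pair.Key + ": " + pair.Value);

        foreach (var pair in Count(SplitLetters(text)))
            Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}
```

Only the decomposition function changes between the word and the letter count; `Count` is reused unchanged, which is the separation that Map and Reduce formalize.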

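Map implementation

Mapping

The Mapping function decomposes the text and maps it into minimal units in key-value form, that is, <word, occurrence count (1)>, or <word, 1>.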

// Emits one (value, 1) pair per input element; evaluated lazily via yield.
public IEnumerable<Tuple<T, int>> Mapping(IEnumerable<T> list)
{
    foreach (T sourceVal in list)
        yield return Tuple.Create(sourceVal, 1);
}

Usage; the output takes the form (brow, 1), (brow, 1), (sorrow, 1), (sorrow, 1):

var spit = hamlet.Split(new[] { " ", Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
var mp = new MicroMapReduce<string>(new Master<string>());
var result = mp.Mapping(spit);

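Combine

To reduce data-transfer overhead, the key-value pairs produced by mapping have their duplicate keys merged before they reach the actual reduce step. This amounts to doing part of the computation ahead of time, which speeds up the overall job. The output takes the form (brow, 2), (sorrow, 2):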

public Dictionary<T, int> Combine(IEnumerable<Tuple<T, int>> list)
{
    Dictionary<T, int> dt = new Dictionary<T, int>();
    foreach (var val in list)
    {
        // Merge pairs that share a key by adding up their counts.
        if (dt.ContainsKey(val.Item1))
            dt[val.Item1] += val.Item2;
        else
            dt.Add(val.Item1, val.Item2);
    }
    return dt;
}

Partitioner

The Partitioner does the grouping: it groups the statistics coming from different nodes by key. Its output takes the form (brow, {(brow,2), (brow,3)}), (sorrow, {(sorrow,10), (sorrow,11)}):

public IEnumerable<Group<T, int>> Partitioner(Dictionary<T, int> list)
{
    var dict = new Dictionary<T, Group<T, int>>();
    foreach (var val in list)
    {
        // Create the group on first sight of a key, then append the value.
        if (!dict.ContainsKey(val.Key))
            dict[val.Key] = new Group<T, int>(val.Key);
        dict[val.Key].Values.Add(val.Value);
    }
    return dict.Values;
}

Group is defined as:

// A key together with the list of values collected for it.
public class Group<TKey, TValue> : Tuple<TKey, List<TValue>>
{
    public Group(TKey key)
        : base(key, new List<TValue>())
    {
    }

    public TKey Key
    {
        get { return base.Item1; }
    }

    public List<TValue> Values
    {
        get { return base.Item2; }
    }
}
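The description of the Partitioner talks about statistics from different nodes, while the method shown takes a single dictionary. As a hedged sketch of the multi-node case, the block below groups the combined counts of two simulated nodes by key. `PartitionSketch` and `PartitionAcrossNodes` are hypothetical names, not part of the article's code, and a plain `Dictionary<TKey, List<int>>` stands in for `Group` to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartitionSketch
{
    // Hypothetical helper (not from the article): groups the combined
    // counts produced by several nodes by key.
    public static Dictionary<TKey, List<int>> PartitionAcrossNodes<TKey>(
        IEnumerable<Dictionary<TKey, int>> nodeResults)
    {
        var groups = new Dictionary<TKey, List<int>>();
        foreach (var nodeResult in nodeResults)
        {
            foreach (var pair in nodeResult)
            {
                List<int> values;
                if (!groups.TryGetValue(pair.Key, out values))
                {
                    // First time this key is seen: start a new group.
                    values = new List<int>();
                    groups[pair.Key] = values;
                }
                values.Add(pair.Value);
            }
        }
        return groups;
    }

    public static void Main()
    {
        // Made-up per-node Combine results, matching the output format above.
        var node1 = new Dictionary<string, int> { { "brow", 2 }, { "sorrow", 10 } };
        var node2 = new Dictionary<string, int> { { "brow", 3 }, { "sorrow", 11 } };

        var groups = PartitionAcrossNodes(new[] { node1, node2 });
        foreach (var group in groups)
        {
            Console.WriteLine(group.Key + ": [" + string.Join(", ", group.Value)
                + "] sums to " + group.Value.Sum());
        }
    }
}
```

Here the group for brow holds 2 and 3, one value per node, which a reduce step would then sum to 5.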

Reduce implementation

The Reducing function receives the grouped data and performs the final aggregation.

public Dictionary<T, int> Reducing(IEnumerable<Group<T, int>> groups)
{
    Dictionary<T, int> result = new Dictionary<T, int>();
    foreach (var sourceVal in groups)
    {
        // Sum() is the LINQ extension method and needs "using System.Linq;".
        result.Add(sourceVal.Key, sourceVal.Values.Sum());
    }
    return result;
}

Wrapped up, the calls look like this:

// Map side: mapping, local combine, then grouping by key.
public IEnumerable<Group<T, int>> Map(IEnumerable<T> list)
{
    var step1 = Mapping(list);
    var step2 = Combine(step1);
    var step3 = Partitioner(step2);
    return step3;
}

// Reduce side: final aggregation over the groups.
public Dictionary<T, int> Reduce(IEnumerable<Group<T, int>> groups)
{
    var step1 = Reducing(groups);
    return step1;
}
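To see the whole flow end to end with more than one node, here is a compact sketch that simulates nodes with tasks on one machine. It is an illustration under assumptions rather than the article's distributed implementation: `PipelineSketch`, `MapOnNode`, `PartitionAndReduce`, and `Run` are hypothetical names, LINQ replaces the hand-written loops, and the round-robin chunking is just one simple way to hand out work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class PipelineSketch
{
    // Map side for one simulated node: emit (word, 1) pairs, then combine locally.
    public static Dictionary<string, int> MapOnNode(IEnumerable<string> words)
    {
        return words.Select(w => Tuple.Create(w, 1))
                    .GroupBy(t => t.Item1)
                    .ToDictionary(g => g.Key, g => g.Sum(t => t.Item2));
    }

    // Partition the per-node counts by key, then reduce each group by summing.
    public static Dictionary<string, int> PartitionAndReduce(
        IEnumerable<Dictionary<string, int>> nodeResults)
    {
        return nodeResults.SelectMany(d => d)
                          .GroupBy(p => p.Key, p => p.Value)
                          .ToDictionary(g => g.Key, g => g.Sum());
    }

    public static Dictionary<string, int> Run(string text, int nodeCount)
    {
        if (nodeCount < 1)
            throw new ArgumentOutOfRangeException("nodeCount");

        var words = text.Split(new[] { ' ', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

        // Deal the words out round-robin, one chunk per simulated node.
        var chunks = words.Select((word, index) => new { word, index })
                          .GroupBy(x => x.index % nodeCount, x => x.word)
                          .Select(g => g.ToList())
                          .ToList();

        // Each chunk is mapped on its own task, standing in for a remote node.
        var tasks = chunks.Select(chunk => Task.Run(() => MapOnNode(chunk))).ToArray();
        var nodeResults = Task.WhenAll(tasks).GetAwaiter().GetResult();

        return PartitionAndReduce(nodeResults);
    }

    public static void Main()
    {
        var counts = Run("brow sorrow brow our sorrow our our", 2);
        foreach (var pair in counts.OrderBy(p => p.Key))
            Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}
```

Because each simulated node combines its own chunk first, only one pair per distinct word and node reaches the partition step, which is the communication saving that Combine is meant to provide.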

public Dictionar

 
 
 