Interesting task for programmers

D
Site user since 18.12.2015
Offline
142
19

I've wanted for a long time to start a thread with puzzles, so that programmers can sharpen their skills and not get rusty.

There is a file with the following content (link to the complete file):


[
  {
    "id": 47704,
    "scores": 0.7003193510956659
  },
  {
    "id": 9811,
    "scores": 0.7147695018808078
  }
]

You need to write a streaming decoder that parses the file, filters the objects where scores > 0.7, and writes those objects to another file (also as a stream). Loading the whole file into memory and unpacking the array is not allowed. The structure of the input file cannot be changed. The output format is the same JSON. The order of the results does not matter. The language does not matter.
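To make the requirements concrete, here is a minimal Python sketch of one way to meet them. It is not a reference solution. It assumes the keys are lowercase "id" and "scores", and that every object is flat (no nested braces and no braces inside strings). The names iter_objects and filter_stream are made up for this illustration.

```python
import json
from typing import Iterator, TextIO


def iter_objects(src: TextIO, chunk_size: int = 4096) -> Iterator[dict]:
    """Yield flat JSON objects one at a time without loading the whole file."""
    buf = []        # characters of the object currently being collected
    inside = False  # True while we are between "{" and "}"
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        for ch in chunk:
            if ch == "{":
                inside = True
            if inside:
                buf.append(ch)
            if ch == "}" and inside:
                # Assumption: objects are flat, so the first "}" closes the object.
                yield json.loads("".join(buf))
                buf, inside = [], False


def filter_stream(src: TextIO, dst: TextIO, threshold: float = 0.7) -> int:
    """Copy objects with scores > threshold from src to dst as a JSON array."""
    written = 0
    dst.write("[")
    for obj in iter_objects(src):
        if obj.get("scores", 0) > threshold:
            if written:
                dst.write(",")  # comma only between elements, never trailing
            dst.write(json.dumps(obj))
            written += 1
    dst.write("]")
    return written
```

It could be called as `with open("task-1.json") as src, open("task-1-out.json", "w") as dst: filter_stream(src, dst)`. Only one chunk and one object are held in memory at any moment.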

Development and support of high-load projects.
onep
Site user since 30.09.2019
Offline
18
#1

Not sure this is the right solution. Just bumping the thread to support it)


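$.ajax({
    url: "//0x.com.ua/task-1.json",
    dataType: "json",
    success: function (e) {
        // t is whatever consumes the result; it is not defined in this post.
        // $.map drops items for which the callback returns undefined.
        t($.map(e, function (e) {
            if (e.scores > 0.7) {
                return {
                    id: e.id,
                    scores: e.scores
                };
            }
        }));
    }
});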
T7
Site user since 19.09.2018
Offline
31
#2

<?php
header('Content-Type: text/plain');
$handle = fopen('/var/web/aio/data/json', 'cb+');
$str = ''; $read = false;
$t0 = microtime(true);
$i = 0;   // objects seen so far
$ii = 0;  // objects that passed the filter
while (!feof($handle)) {
    $c = fread($handle, 1);
    if ($c == '{') {
        $read = true;
    }
    if ($read) {
        $str .= $c;
    }
    if ($c == '}') {
        $a = json_decode($str);
        $i++;
        if (is_object($a) && $a->scores > 0.7) {
            $ii++;
            $a->allobjcnt = $i;
            $a->matchedobjcnt = $ii;
            print_r($a);
        }
        $str = ''; $read = false;
    }
}
echo sprintf("Timing: %01.4f sec", (microtime(true) - $t0)), "\n";
fclose($handle);
?>

I didn't notice the part about writing to the results file right away.

But that part is easy: if the output file is empty, simply write "[{json_encode of the match}]".

Otherwise go to the end, find the position of the "]", and from that position write ",{json_encode of the match}]".

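fseek($handle, -1, SEEK_END);
$pos = ftell($handle);
// Walk backwards one byte at a time until the closing "]" is found.
while ($pos >= 0 && fread($handle, 1) !== ']') {
    $pos--;
    fseek($handle, $pos);
}
// Step back onto the "]" so the next write overwrites it.
fseek($handle, $pos);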
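The append trick described above can also be sketched in Python. This is only an illustration under a few assumptions: the output file is either missing, empty, or already a valid JSON array, and append_to_json_array is a made-up helper name.

```python
import json
import os


def append_to_json_array(path: str, obj: dict) -> None:
    """Append obj to the JSON array stored in path without rereading the file."""
    data = json.dumps(obj).encode("utf-8")
    if not os.path.exists(path) or os.path.getsize(path) == 0:
        with open(path, "wb") as f:
            f.write(b"[" + data + b"]")
        return
    with open(path, "r+b") as f:
        pos = f.seek(0, os.SEEK_END) - 1
        # Walk backwards from the end until the closing "]" is found.
        while pos >= 0:
            f.seek(pos)
            if f.read(1) == b"]":
                break
            pos -= 1
        if pos < 0:
            raise ValueError("no closing ']' found")
        # Find the last non-whitespace byte before "]" to detect an empty array.
        last = b""
        prev = pos - 1
        while prev >= 0:
            f.seek(prev)
            last = f.read(1)
            if not last.isspace():
                break
            prev -= 1
        sep = b"" if last == b"[" else b","
        # Overwrite the old "]" with the separator, the new object and a new "]".
        f.seek(pos)
        f.write(sep + data + b"]")
        f.truncate()
```

Each call touches only the tail of the file, so the cost does not grow with the number of objects already written.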
S3
Site user since 29.03.2012
Offline
212
#3


import json
import time

start = time.time()
with open('task-1.json') as data:
    f = json.load(data)
    print(len(f))
with open('task-1.json') as data:
    content = data.read()
    row = json.loads(content)
    with open('res-task-1.json', 'w') as f:
        f.write('[')
        for item in row:
            if item.get('scores') >= 0.7:
                f.write(json.dumps(item))
                f.write(',')
        f.write(']')
print(time.time() - start)

Two of the lines there are extra; they are only for measuring the time.

On average it runs in 0.03 seconds on a quad-core Core i5.

There were 10,000 entries; 2,995 remained.

T7
Site user since 19.09.2018
Offline
31
#4
Sly32:
content = data.read()

The task forbids loading the entire file:

Danforth:
Loading the whole file into memory and unpacking the array is not allowed.
Z0
Site user since 03.09.2009
Offline
730
#5

I'd give you a puzzle of my own, but I don't want to give myself away 🤪 It's purely algorithmic, though, and I still haven't managed to crack it, my little brain is too weak :p

S3
Site user since 29.03.2012
Offline
212
#6
timo-71:
The task forbids loading the entire file

Yes, in that case my solution is wrong.

D
Site user since 18.12.2015
Offline
142
#7

Sly32, the output is not valid JSON :) And you read everything into memory anyway.

My solution is in Go and, as usual, the longest one: https://play.golang.org/p/gWg-gqR-Xvi

 file filtered successfully, took 22.997923ms 
S3
Site user since 29.03.2012
Offline
212
#8
Danforth:
Sly32, the output is not valid JSON. And you read everything into memory anyway.

Yes, I see, but I haven't found an elegant solution yet. I don't like the idea of reading the file character by character by hand) With validation, I think I know how to solve it. For now I'm digging into what Python offers for such cases. The code will, in principle, be about the same length as the PHP one.
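As one example of what the Python standard library offers here, json.JSONDecoder.raw_decode parses a single value from the start of a string and reports where it stopped, so the file can be fed to it in chunks and each object still gets validated. The sketch below rests on assumptions: the top level is an array whose elements are objects, and iter_array_items is a hypothetical helper name.

```python
import json
from typing import Iterator, TextIO


def iter_array_items(src: TextIO, chunk_size: int = 65536) -> Iterator[object]:
    """Yield the elements of a top-level JSON array one by one."""
    decoder = json.JSONDecoder()
    buf = src.read(chunk_size).lstrip()
    if not buf.startswith("["):
        raise ValueError("expected a JSON array")
    buf = buf[1:]
    while True:
        # Drop whitespace and the comma that separates elements.
        buf = buf.lstrip(" \t\r\n,")
        if buf.startswith("]"):
            return
        try:
            item, end = decoder.raw_decode(buf)
        except json.JSONDecodeError:
            # The buffer holds an incomplete element: read more and retry.
            more = src.read(chunk_size)
            if not more:
                raise  # truncated or malformed input
            buf += more
            continue
        yield item
        buf = buf[end:]
```

Filtering then becomes a generator expression such as `(it for it in iter_array_items(src) if it["scores"] > 0.7)`, and memory use is bounded by the chunk size plus one element. Note that the assumption about object elements matters: a bare number split across two chunks could be decoded too early.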

Gerga
Site user since 02.08.2015
Offline
89
#9
Sly32:
The code will, in principle, be about the same length as the PHP one.

In PHP it can be even shorter if you don't read character by character.


define('MINIMUM', 0.7);
define('DELIMER', '}');

$filepath_in = 'task-1.json';
$filepath_out = 'task-1-out.json';

$b = '';  // separator written before each object; empty for the first one

$handle_in = fopen($filepath_in, 'r');

file_put_contents($filepath_out, '[');

while (!feof($handle_in)) {
    // Read up to the next "}" (at most 100 bytes): one object without its closing brace.
    $line = stream_get_line($handle_in, 100, DELIMER);

    // Group 2 captures the trailing number, i.e. the "scores" value.
    // The s flag lets "." span line breaks inside a pretty-printed object.
    $r = preg_match('/\{(.*?)([0-9.]+)\s*$/s', $line, $matches);

    if ($r && MINIMUM < (float) $matches[2]) {
        $line = $b . $matches[0] . DELIMER;
        file_put_contents($filepath_out, $line, FILE_APPEND);
        $b = ',';
    }
}

file_put_contents($filepath_out, ']', FILE_APPEND);

fclose($handle_in);
VoV@
Site user since 22.09.2007
Offline
196
#10

Here, I solved it with simple brute force:


using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

namespace searchengines
{
    class Program
    {
        static void Main(string[] args)
        {
            var _sourceArrayItem = new List<char>();
            var canRead = false;
            var firstAppend = true;

            using var sr = new StreamReader($"{Environment.CurrentDirectory}\\task-1.json");
            // FileMode.Create truncates an existing result file, so no stale bytes remain.
            using var sw = new FileStream($"{Environment.CurrentDirectory}\\result.json", FileMode.Create);

            var terminalSymbol = Encoding.Default.GetBytes("[");
            sw.Write(terminalSymbol, 0, terminalSymbol.Length);

            while (sr.Peek() >= 0)
            {
                var symbol = (char)sr.Read();
                // Collect characters from "{" up to and including the matching "}".
                if (symbol == '{' || canRead)
                {
                    _sourceArrayItem.Add(symbol);
                    canRead = true;
                }
                if (symbol == '}')
                {
                    var item = new string(_sourceArrayItem.ToArray());
                    var parsedItem = JsonConvert.DeserializeObject<TaskItem>(item);

                    if (parsedItem.Scores > 0.7)
                    {
                        // Prefix every object except the first with a comma.
                        item = firstAppend ? item : $",{item}";
                        var buffer = Encoding.Default.GetBytes(item);
                        sw.Write(buffer, 0, buffer.Length);
                        sw.Flush();

                        firstAppend = false;
                    }

                    _sourceArrayItem = new List<char>();
                    canRead = false;
                }
            }

            terminalSymbol = Encoding.Default.GetBytes("]");
            sw.Write(terminalSymbol, 0, terminalSymbol.Length);
            sw.Close();
        }

        public class TaskItem
        {
            public int Id { get; set; }
            public double Scores { get; set; }
        }
    }
}

If you can't load everything into memory at once, then IMHO the only option is brute force, working character by character.

⭐ Android app development (Xamarin C#). ⭐ ASP.NET development (WebForms, MVC, WebAPI, Core). ⭐ Tsoi lives!
